Added rop of pending updates on bot start, reset command, AnswerChat method, GPU offload, limit to response lenght, context reduced to 2048, flash attention, 4 parallel decode queues, --keep of the original 810 tokens (which is the starting prompt)
This commit is contained in:
@@ -1,4 +1,4 @@
|
||||
TELEGRAM_BOT_TOKEN=yourTokenHere
|
||||
OPENAI_BASE_URL=http://llm-server/
|
||||
OPENAI_MODEL=Qwen2.5-7B-Instruct-Q8_0.gguf
|
||||
OPENAI_MODEL=Qwen2.5-7B-Instruct-Q8.gguf
|
||||
OPENAI_API_KEY=MyApiKey
|
||||
|
||||
Reference in New Issue
Block a user