SillyTavern response length. I'm running Wizard Mega 13B GPTQ from TheBloke.

Tavern is a user interface you can install on your computer (and Android phones) that lets you chat and roleplay with text-generation AIs through characters you or the community create; SillyTavern is a fork of TavernAI 1.8 which is under more active development and has added many major features. This rundown does not cover everything, but I believe it's enough to make you understand how SillyTavern works and how to have a good roleplay experience even if you don't know how AI works in general. (I posted it as a comment two days ago and decided to make it into a post with more info.)

Response length lives in the AI Response Configuration menu. The Response (tokens) slider is a hard cap on how many tokens the API will generate per reply; even Multigen will stop at that limit if there is more text to generate. The slider maxes out at 2k by default and 16k when using the unlocked context option. If you only want one or two short paragraphs, set it to ~160 tokens. Backends can impose their own limits: NovelAI's Kayra, for example, caps each response at 150 tokens for some reason. For ready-made presets, see Virt-io/SillyTavern-Presets.

The cap alone doesn't control style. In the character card, fill out the first message and the examples of dialogue in the desired style (for example, short chat-style responses without long descriptions). In the AI Response Formatting section, specify your desired response style and length in the System Prompt and the Last Output Sequence. The problem runs both ways: I'm using the 'Merica! jailbreak for Clewd and it seems to be working fine, except the generations are far too lengthy; I adjusted my jailbreak and my settings and still don't know what I'm doing wrong. If you grab a model for this, make sure to use the Instruct version.

Performance is the other half of the problem. I've been using KoboldCpp with SillyTavern and getting slow responses (around 2 t/s); I've never had good results with local models, which take more than a minute to respond and give me about a third of the response length I get from Poe. The quality was good, but waiting 4 minutes for a response Poe generates in 15 seconds is kind of a buzzkill. There's also a good reason some people want to keep token wastage to a minimum: every token reserved for the response is a token your chat history can't use (see the sketch below). For reference, my generation settings for both models were the same, base preset Godlike, changed to 600 tokens response length and 4k context; after trimming things down, all reports (SillyTavern's last-message stats and Ooba's console) showed no more than 1680 tokens of context used.
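To make that budget concrete, here's a minimal sketch of the arithmetic in Python. The specific numbers (4k context, 600-token response, a 1000-token card) are illustrative assumptions, and SillyTavern's exact accounting differs slightly:

```python
# Rough sketch of how the context budget splits up. Everything sent to the
# model (system prompt, character card, chat history) plus the reserved
# response length must fit inside the context window.

context_tokens = 4096    # "Context (tokens)" slider
response_tokens = 600    # "Response (tokens)" slider, reserved for the reply
card_and_prompt = 1000   # character card + system prompt (example value)

history_budget = context_tokens - response_tokens - card_and_prompt
print(f"Tokens left for chat history: {history_budget}")  # -> 2496

# Raising response_tokens doesn't make replies longer by itself; it just
# raises the cap and shrinks how much chat history fits in the prompt.
```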
Hardware first. Wizard Mega 13B GPTQ obviously needs a fair amount of VRAM; with Llama 3 Euryale 70B v2 loaded, 32GB of VRAM is used by my PC out of my maximum of 36GB. If you're only getting one short sentence per response, or generation crawls, it sounds like the model is overflowing from VRAM and hitting system RAM, which will cause a noticeable drop in speed; I'll try out the Lucid 7B 4KS next and see if that's a bit better. Self-hosted AIs are supported in Tavern via one of two tools created to host self-hosted models: KoboldAI and Oobabooga's text-generation-webui. And no, if the model supports a context size of 8000, don't get worried about setting the slider above 4095; there are no repercussions or anything.

Prompt-side fixes help as well. One thing that has kind of helped was putting specific instructions in the Author's Note, e.g. [Write only three paragraphs, using less than 300 tokens], though I don't know if that'll work that great for everyone. Adding a phrase to the character's Description box such as "likes to talk a lot" or "very verbose speaker" pushes length the other way. In Advanced Formatting I selected Trim Incomplete Sentences so capped replies don't end mid-word, and I set the response length to 1024, cause why not. At the prompt I usually type [look around] or [look] and get what I want, though sometimes the response is a couple of paragraphs in length. There's also a longer utility prompt I use; turn up the Max Response Length when you're going to trigger it, so it has enough tokens to work properly: ---Prompt Begins--- [Pause the roleplay. ...]

On samplers: the Mirostat preset with Tau=5 works pretty well for me, with all other sampling methods disabled, but I'd imagine there's probably something else worth trying. If you get weird responses or broken formatting/regex, play with the sampler settings. Honestly, I'm not really sure what can be improved, since as-is, Noromaid blows everything else out of the water. One unrelated gotcha: I selected a model from Horde, my parameters are all appropriate and the model is very much available, but SillyTavern keeps saying there are no Horde models to generate text with my request. Is that a problem on my end?
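If you want to sanity-check whether a model should fit before loading it, a back-of-the-envelope estimate is enough. This is a rough rule of thumb, not an exact formula; real usage adds KV-cache and framework overhead that grow with context length:

```python
# Very rough VRAM check. Numbers are illustrative estimates, not exact
# figures for any specific quantisation or loader.

def approx_vram_gb(params_billion: float, bits_per_weight: float,
                   overhead_gb: float = 1.5) -> float:
    weights_gb = params_billion * bits_per_weight / 8  # GB for the weights
    return weights_gb + overhead_gb

print(f"13B @ 4-bit: ~{approx_vram_gb(13, 4):.1f} GB")  # ~8 GB
print(f"70B @ 4-bit: ~{approx_vram_gb(70, 4):.1f} GB")  # ~36.5 GB

# If this lands above your GPU's VRAM, layers spill into system RAM and
# speed drops sharply, which is the 2 t/s symptom described above.
```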
Backends differ a lot here. NovelAI limits response length to ~150 tokens total, even if you set the slider higher than that; even at the max it won't do anything. I usually stick with the preset "Carefree Kayra", AI Module set to Text, and remember that the higher the response length, the longer it will take to generate the response. In Advanced Formatting, click the "Enable Instruct Mode" button (ON/OFF next to the name "Instruct Template"). For samplers I then use Mirostat mode 2, with 8 and 0.1 on the other two sliders (I forgot the names; they should be tau and eta). With my context settings, that leaves 2895 tokens of 'memory' available for the AI. Thread necro, but I'm running Pyg7b on SillyTavern and just found that the "min length" setting in the master settings is worth a look too. In SillyTavern itself I've only used the "Recovered Ruins" preset, because I'm not sure of the difference between the other options; I was also using SillyTavern's own generation parameters, set to the Novel AI - Pleasing Result preset like in Ooba, but I didn't set them simultaneously.

Everything SillyTavern sends is seen by the model, but it is not weighted equally. Follow the basic rule: the lower something is located in the prompt, the more influential it is to the response. And what comes after char's line? User's. That's part of why the AI sometimes takes the situation from A to Z in one response, leaving you nothing to roleplay, and why responses sometimes even exceed the response length I set up (366).

If Kobold is your backend and you see an error like: Kobold returned error: 422 UNPROCESSABLE ENTITY {"detail":{"max_length":["Must be greater than or ..., the backend is rejecting the max_length (response length) value SillyTavern sent; see the sketch below for which fields are involved. On hosted services: I'm using MythoLite with Mancer, and I've been happy with the results, but I'd like a bit shorter responses from the AI. It was just weird that only Mancer was doing this and not any of the other models I've used in SillyTavern, but if this helps alleviate the issue, Mancer will quickly take the top spot for me, as it's already had better responses in general. NovelAI has had the opposite problem for me: I was curious if anyone had useful advice to get the bots to respond with more than 2 sentences. As a third and free option I'd mention the Poe integration a wonderful person here revived for a variant of SillyTavern; it's great, but its context size is really small.
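For reference, here is a minimal sketch of a direct call to a KoboldCpp-style backend, assuming a local instance on the default port 5001 and the KoboldAI-compatible generate endpoint; the field names reflect my understanding of that API, so treat them as assumptions:

```python
# Minimal sketch of a direct call to a local KoboldCpp-style endpoint.
# "max_length" is the response length that SillyTavern's slider maps to;
# the 422 error above is the backend rejecting an out-of-range value.
import requests

payload = {
    "prompt": "You are a terse narrator.\nUser: Hello!\nNarrator:",
    "max_length": 200,           # response length in tokens
    "max_context_length": 4096,  # total window the prompt must fit inside
    "temperature": 0.7,
    "rep_pen": 1.1,
}

r = requests.post("http://127.0.0.1:5001/api/v1/generate",
                  json=payload, timeout=120)
r.raise_for_status()
print(r.json()["results"][0]["text"])
```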
If you're using a model that should fit in your VRAM (7B and smaller models should, some quantisations of 13B too), then the culprit is likely that the context is being filled, and chonky character cards can do that pretty quickly: those tokens come straight out of your history budget. If supported by the API, you can enable Streaming to display the response bit by bit as it is being generated; when Streaming is off, responses will be displayed all at once when they are complete. A few per-message controls are worth knowing too: the Message actions button offers options like translation, image generation, and story branching; Edit lets you change any message's content; Swipe generates a different response.

For the formatting side, click the "Capital A" tab in the SillyTavern UI (AI Response Formatting). The docs define Response (tokens) as the maximum number of tokens that the API will generate to respond. The important mechanical detail: if the model wanted to write more than the desired response length, the reply is truncated to fit, but the model doesn't rethink what it was going to write, so a cap that's set too short leaves short responses or truncated lines. Set Target Length in the Advanced Formatting menu as well; again, ~160 tokens if you want one or two short paragraphs. Capping responses is particularly useful when you want to keep the conversation concise or are working within specific character limits, such as for Twitter or chat-based platforms. There's currently an issue with Response Length on zen sliders defaulting to 16, so check that if replies suddenly come out tiny. Maybe what's really missing is a recommended preset (not context or instruct, but a model response preset).

A prompt trick that works more often than not: end the prompt with Response (3 paragraphs, engaging, natural, authentic, descriptive, creative): and you should get 3-paragraph responses. Some models still resist: Gemini gives me repetition and response-length problems no matter what I try, and after what felt like an eternity getting SillyTavern working with Poe, the responses were amazing but I couldn't adjust their length either.

Before anyone asks, my experimented settings are Max Response Length = 400, Temperature = 0.85, Frequency Penalty = 0.8, Presence Penalty = 0.8, Top P = 1. A second set I've had luck with: Temp 0.65, Repetition Penalty 1.1, Repetition Penalty Range 1024, Top P Sampling 0.9. And my preset for Nyanade deliberately encourages shorter replies; after importing it, change "Response" in the Instruct section if you want something else. It uses: Target Length 200, Padding 20, Generate only one line per request checked, Trim Incomplete Sentences checked, Include Newline checked, Response (tokens) 350, Context (tokens) 28160 (note: that much context slows the model down for me, so consider limiting Response Length to 300).
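Here's what that trimming step amounts to, as an illustrative sketch only (this is the idea, not SillyTavern's actual implementation): cut the reply at the last sentence-ending character, so a response that hit the token cap mid-sentence doesn't end on a fragment.

```python
# Illustrative sketch of a "trim incomplete sentences" pass: keep text up to
# the last sentence-ending character and drop the trailing fragment.

SENTENCE_ENDS = ('.', '!', '?', '*', '"')

def trim_incomplete(text: str) -> str:
    cut = max(text.rfind(ch) for ch in SENTENCE_ENDS)
    return text[:cut + 1] if cut != -1 else text

reply = 'She nods slowly. "Fine," she says. Then she turns and'
print(trim_incomplete(reply))  # -> She nods slowly. "Fine," she says.
```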
These tokens will be filled up with your chat history, so the two budgets push against each other. On NovelAI, the way past the cap is multigen: I've read that the option to increase max response length is tied to it. IIRC, SillyTavern using NovelAI has a theoretical hard limit of ~170 tokens per generation, but multigen chains generations: I put 150 tokens in both chunk fields and increase the tokens in Response Length to 300 accordingly, and you must also disable Streaming, since it conflicts with multigen. When I tested, the output suspiciously worked out to 297 tokens, so there's your confirmation; a conceptual sketch of the idea follows below.

You can change these settings mid-chat. If responses are getting truncated, you don't have to restart SillyTavern: click "AI Response Configuration" (the slider menu in the top left), raise the Response (tokens) slider, and regenerate; a greater number of tokens should increase the output length. Also, from my testing, if you make the response tokens about 50-60 tokens higher than Target Length (tokens), that seems to cause much fewer broken sentences.

Don't count on word-count instructions. I tried writing "Your responses are strictly limited to 100 words" in the system prompt, but the model seems to ignore it; it's still around that 110 or so max words per response (16k llama2-13b on a 4090). Most 7B models are kinda bad for RP from my testing, but if you do use Gemini Pro, the Simple Proxy for Tavern context template seems to work well for me with instruct mode turned off; for chatting, I recommend the default depth of 1 since it's very flexible with other components of SillyTavern. The rest is a model thing: Mixtral is basically a tie with Goliath, but Goliath is better for erotica while Mixtral is better for SFW stuff. And one day my model started spouting tons of 💦 emojis; before that I didn't know SillyTavern could use emoji at all.

Side notes: SillyTavern is being developed using a two-branch system to ensure a smooth experience for all users; release (🌟 recommended for most users) is the most stable branch, updated only when major releases are pushed. I didn't start using SillyTavern until roughly 1.7, so I don't know how long some of these options have been there; when updating, I move the whole folder into an "old ST" folder, put the new folder where the old one was, and copy my /public/characters and /public/chats folders into the new version, so I can roll back if the new version is broken. Separate question: how do you add Henk717/airochronos-33B to the drop-down list in the SillyTavern AI Response Configuration? I've added it on the API Connections page but it doesn't show up there; I am just confused.
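As a conceptual sketch of the multigen behavior described above: request the reply in fixed-size chunks and keep going until the Response Length budget is spent or the model stops on its own. The generate_chunk helper here is a hypothetical stand-in for a real backend call, not an actual SillyTavern function:

```python
# Conceptual sketch of multigen: chain chunked generations until the
# Response Length budget is spent or the model produces nothing more.

def generate_chunk(prompt: str, max_tokens: int) -> str:
    # Stand-in for a real backend call (e.g. the KoboldCpp endpoint above).
    # A real implementation would return up to max_tokens of new text.
    return ""

def multigen(prompt: str, response_length: int = 300, chunk: int = 150) -> str:
    reply, budget = "", response_length
    while budget > 0:
        piece = generate_chunk(prompt + reply, min(chunk, budget))
        if not piece:              # model stopped early -> done
            break
        reply += piece
        budget -= min(chunk, budget)
    return reply  # stops at Response Length even if more text remains
```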
And as for the length of the answer, it is regulated by the prompt itself on one side and capped by Max Response Length on the other. Remember, LLMs don't think or understand; they predict. SillyTavern sends your old messages from the same chat (up to the context size), and the model continues the pattern. That's why Pygmalion and Wizard-Vicuna based models do a great job of varying response lengths, sometimes approaching the token limit and sometimes just offering quick 30-token replies, and it's also why response length can be stubbornly inconsistent: no matter how high the temperature or context is set, or how lengthy the starting message, a model will sometimes just decide not to output more than a 40-word response. It may even narrate your actions when the jailbreak prompt says not to.

The stopping mechanics matter too: when generation reaches the number of tokens in the slider (or 150, whichever is lower), it will generate up to 20 more tokens, looking for a stop. Hence the practical recipe. Shorter responses: decrease response tokens to 200 and turn on trim. Longer responses: increase response tokens and spam the continue button. With the context length set to 32768, you can comfortably push the response length higher than 250 tokens. Note that the Summarize extension has its own setting, API response length (tokens), which overrides the globally set value for generating summaries. In one truncation report, all 2048 context tokens were in use while the max response length was assumed to be around 414 tokens; the two budgets collide in exactly that way.

Model and config notes: I'm having issues prompting/instructing Noromaid-v0.4-Mixtral-Instruct-8x7b (response length, parroting, repetition, rambling) at 9k context, although Noromaid-v0.1-mixtral-8x7b-v3, run on koboldcpp by someone else, behaved fine. My loader settings: ExLlamav2_HF, gpu-split 20,11, 32512 sequence length (32768 should be fine as well, but I like to keep a little overhead), 8-bit cache, 4 experts per token. Load my Context Template (Story String) preset from the Context Templates list and my Instruct Template preset from the Instruct Templates list. Another full set of sampler values someone shared, at response length 400 and context size 2048: Temp 0.69, Rep. Pen. 1.15, Top P 0.9, Top A 0, Top K 11, Typical Sampling 1, Tail Free Sampling 0.9; will change if better results turn up. Ever since we lost Poe (p-b tokens now come back invalid or expired), nothing has been quite the same; NovelAI is at least the most consistent alternative, barring the response length, and the website's max character length is about 600 anyway. For the prompt side, put the length instruction near the end of the prompt, where it is most influential; a sketch of that follows below.
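This is a hypothetical helper showing that prompt-side trick: append a length/style instruction near the end of the prompt, and keep the token cap a bit above the target so trimming has room to work. The instruction text mirrors the examples in this thread; the function name and values are my own illustration, not anything SillyTavern exposes:

```python
# Hypothetical sketch: append a length hint where it has the most influence,
# and size the token cap slightly above the target length.

def with_length_hint(prompt: str, paragraphs: int = 3) -> str:
    hint = (f"[Write only {paragraphs} paragraphs. Keep the response "
            f"engaging, natural, and descriptive.]")
    return f"{prompt}\n{hint}\n"

target_length = 200   # what you actually want
padding = 60          # cap ~50-60 tokens above target, per the tip above
response_tokens = target_length + padding

print(with_length_hint("...chat history here...", paragraphs=3))
print(f"Set Response (tokens) to ~{response_tokens}")
```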
Troubleshooting: if you send a message and sometimes get the "SillyTavern isn't responding, do you want to wait or end task" dialog, double-check the length of your currently active chat; on my PC (AMD Ryzen 3700, AMD Radeon RX 5700, 16 GB of RAM and 8 GB of VRAM), long chats are enough to cause it. I've also read many quick "helpers" for different situations here on this sub, and I wonder how many of them can be placed into the Author's Note at once before they lose their effect or even cause harm; I'm not focusing on response length there so much as on how much info the Author's Note can hold (token-wise) before it gets unusable. And for what it's worth, the multigen option mentioned above seems to be absent from my Advanced Formatting options, so your version may vary.

To recap the levers for shorter responses: decrease the value of the Response Length setting (50-100 tokens should give you the short output you want; I keep it at around 180, and if you're using OpenAI GPT as your API, click the top-left menu and change Max Response Length (tokens) there); give the character a phrase like 'short spoken' or 'doesn't talk much' in their Description; and provide the bot with examples of how you want it to write, labelling who ({{user}} or {{char}}) says what. AI models can improve a lot when given guidance about the writing style you expect. Mind the arithmetic as well: char's actions and dialogue typically take around 100 to 150 tokens, so if you have the slider set to 350, there are a couple hundred extra tokens the LLM will try to fill in. Slash commands work too; the syntax I am using looks like: /trimtokens limit=60 direction=start | /continue. Jailbreak or character-note instructions like "no more than 3 paragraphs" haven't given me good results, so lean on the cap and trimming instead.

For longer responses, do the reverse: increase the Response Length, design a good First Message for the character that shows them speaking in a long-winded manner, and add "likes to talk a lot" to the Description. Big models like Command R on OpenRouter will write however long you want, usually close to the token limit you set, while my Silicon Maid 7B Q4_K_S averages 100-200 tokens even though the responses are fantastic. Maybe try these (don't let the parameter sizes put you off): my usual Mistral-0.2 daily driver Nyanade_Stunna-Maid-7B-v0.2, or Jeiku's Aura_Uncensored_l3_8B, the latest attempt at a less-censored Llama-3; I run a low temp (adjust this if you want, sometimes I lower it further) and leave Top P, Top K and that stuff on the preset defaults. I looked at the relevant TOS and they only seem to have a problem with the touchier stuff, so vanilla ERP is on the menu, and from what I've seen, these models are not half bad at it either.