Best settings for KoboldAI (compiled from Reddit)
I've been using KoboldAI Lite for the past week or so for various roleplays, and while generally it's been fantastic, two things keep cropping up that are starting to annoy me: my character says one sentence and the bot answers by turning a character that doesn't talk much into a full-on speech every time, and the text ends up like four full Word pages long when I'm trying to have a simple conversation. I've tried multiple settings, lowering max tokens, etc., and the responses are just not good. However much I fine-tune and fine-tune my settings, it's hard for me to find a happy medium, and I am kind of DONE with the current settings I have. My two questions: what are the best settings that you use, and how do you get better outputs?

Keep in mind that samplers only alter the output of the model from the default behaviour, which is to compute the token probability distribution and choose according to it; that is what you get with all samplers off, temperature = 1, and rep penalty = 1.

A frequently quoted preset: Temperature 0.7, Top-p 0.9, Rep pen 1.1, Rep pen range 1024, Rep pen slope 0.7. Disable all other samplers; mostly everything is off, except for those.

I'm having great output when setting rep pen to 1.01-1.02 with MinP 0.03. 1.04-1.05 are borderline, so I really don't use those. With these settings I barely have any repetition with another model.

SillyTavern supports Dynamic Temperature now, and I suggest trying that: Min Temp 0.5, Max Temp 4.0, Repetition Penalty 1.0, per a recommendation from the developer of the DRY sampler and a few other people.

First things first, try Mirostat sampling if your model supports it (most do) and you're not using it already. It means most of the sliders don't need to be touched, and once you've got it how you like it, you probably won't need to adjust it. It's hard to really say what's "best", and I'm too lazy to do an actual comparison, but it seems to work well and is a nice and modern way to do it.

Otherwise, the defaults are decent. Though, just mess around with the settings and try it out for yourself; you can delete the settings in your Google Drive if you want a fully clear preset after messing around (the Colab version keeps them there).
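If you drive the backend from a script instead of the UI, the same knobs map onto the HTTP API that KoboldCpp exposes. Below is a minimal sketch of the first preset quoted above, assuming a local KoboldCpp instance on its default port 5001 and the stock /api/v1/generate schema; the prompt string and token counts are placeholder values:

    import json
    import urllib.request

    # The "Temperature 0.7 / Top-p 0.9 / Rep pen 1.1" preset as API fields.
    payload = {
        "prompt": "You are standing in a dimly lit tavern.\n",  # placeholder
        "max_context_length": 2048,  # the 2K context mentioned below
        "max_length": 120,           # tokens to generate per reply
        "temperature": 0.7,
        "top_p": 0.9,
        "rep_pen": 1.1,
        "rep_pen_range": 1024,
        "rep_pen_slope": 0.7,
        "top_k": 0,      # 0 = disabled ("all other samplers off")
        "tfs": 1.0,      # 1.0 = disabled
        "typical": 1.0,  # 1.0 = disabled
    }

    req = urllib.request.Request(
        "http://localhost:5001/api/v1/generate",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        print(json.load(resp)["results"][0]["text"])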
Other reported setups: 0.9 temperature with a 1.2 repetition penalty, or 1.11 Rep Penalty with a 1024 Repetition Penalty Range, 322 Amount Generation tokens, and a 1394 Context Size.

On writing World Info, the "key" is that dots (.) end a sentence. If you write "John is 27. John is a mechanic.", the information is counted separately, so it's best to write it as a single sentence with commas instead: "John is a 27-year-old male, he works as a mechanic", etc.

KoboldAI is not an AI on its own; it's a project where you bring an AI model yourself. You can run any AI model (up to 20B in size) that can generate text from the Huggingface website. Q: Does KoboldAI have custom model support? A: Yes, it does. To do this, on the page of the selected model, click the "Copy model name to clipboard" square icon next to the model name highlighted in bold. If a model won't load, check whether there are multiple model files in the same folder: sometimes git repos contain more than one quantization, and KoboldAI might be reading the wrong file. You have to rename the extensions of the models you are not using so KoboldAI will not be confused, and KoboldAI also won't read the model if the groupsize (128g) or filename is not correct.

It's been a while, so I imagine you've already found the answer, but the "B" number is related to how big the LLM is. Generally a higher B number means the LLM was trained on more data and will be more coherent and better able to follow a conversation, but it's also slower and/or needs a more expensive computer to run quickly. The AIs people can typically run at home are very small by comparison, because it is expensive to both use and train larger models.

I've been having good results using models based on Chronos or Hermes, and the model I'm using, Mythologic L2, seems pretty good too. Pyg 6B was great: I ran it through KoboldCpp and then SillyTavern so I could make my characters how I wanted (there's also a good Pyg 6B preset in SillyTavern's settings). If Pyg 6B works, I'd also recommend looking at Wizard Uncensored 13B; TheBloke has GGML versions on Huggingface. As the other guy said, try using another model.

I personally feel like KoboldAI has the worst frontend, so I don't even use it when I'm using KoboldAI to run a model; I use SillyTavern as my frontend 99% of the time, and have pretty much switched to text-generation-webui for running models. It isn't the best, nor is it token light, but it's very accessible. In SillyTavern there is a custom chat separator, instruct mode, context formatting with the tokenizer, token padding, a dedicated Pygmalion formatting force option (which to this day I still haven't figured out what it does), and then multigen and the different anchors, and that's not counting all the sampling methods like top-p, top-a, top-k, and tail-free. If anyone has any additional recommendations for SillyTavern settings to change, let me know, but I'm assuming I should probably ask over on their subreddit instead of here.

Left AID, and KoboldAI is quickly killin' it; I love it. The new Colab J-6B model rocks my socks off and is on par with AID, and the multiple-responses thing makes it 10x better. It was up there with free AID, then I messed with the settings (big mistake) and the output started becoming very bad 😬. I went back and tried the default settings (the only sampling was Top P at 0.9), and I can't think of any setting that made a difference at all.

I'm very new to this and have already played around with KoboldCpp; so far, so good. What are the best settings for Mixtral? Should I leave VRAM for context, or should I offload as many layers as possible? I am using the Q5_K_M quant on this hardware: 32GB DDR4-3400, an RX 7600 XT 16GB, and a Ryzen 5 5600.

My only experience with models that large was that I could barely fit 16 layers on my 3060 card and had to split the rest into normal RAM (about 19 GB), which resulted in about 110 seconds per generation (at the default output tokens). Your setup allows you to try the larger models, like 13B Nerys, as you can split layers to RAM; 13B Q4_K_S is what you can fully offload. That's at 2K context. If you wish to go up to 4K, it might be possible, but then you need to adjust the setting in the Nvidia Control Panel called "CUDA - Sysmem Fallback Policy".
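There is no exact rule for how many layers will fit on a given card. A rough rule of thumb (my own assumption, not an official formula) is to divide the GGUF file size by the layer count plus one, keep a couple of GB of headroom for the context cache, and pass the result to KoboldCpp's --gpulayers flag:

    # Rough --gpulayers estimate. The numbers are illustrative assumptions;
    # real per-layer sizes vary by architecture, quant, and context size.
    def estimate_gpu_layers(file_size_gb: float, total_layers: int,
                            vram_gb: float, headroom_gb: float = 2.0) -> int:
        per_layer_gb = file_size_gb / (total_layers + 1)  # +1 covers non-layer weights
        budget_gb = vram_gb - headroom_gb                 # room for the KV cache
        return max(0, min(total_layers, int(budget_gb / per_layer_gb)))

    # E.g. a ~31 GB Mixtral Q5_K_M with 32 layers on a 16 GB card:
    print(estimate_gpu_layers(31.0, 32, vram_gb=16.0))  # about 14

If the estimate turns out wrong in practice, lower it until generation stops spilling into system RAM (or set the Sysmem Fallback Policy to prefer no fallback, so an overflow fails loudly instead of silently slowing down).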
Ngl, it's mostly for NSFW and other chatbot things. I have a 3060 with 12GB of VRAM, 32GB of RAM, and a Ryzen 7 5800X, and I'm hoping for speeds of around 10-15 seconds using Tavern and KoboldCpp. But are there any settings which would fit my 1080 Ti 11GB GPU? I only use Kobold when running 4-bit models locally on my aging PC.

r/KoboldAI: "Best Settings for RTX 2080 and Ryzen 7 5700X?" Yup, that one is mine, and it includes the settings I use to try and get a better experience for 6B than having it on the game's defaults. The settings it currently ships with are as follows: 0.95 temp, 1.1 rep pen, everything else at off/default. Maybe up the temperature a little.

For the API settings I see JanitorLLM Beta, OpenAI, and KoboldAI, so what's the difference between those three? And another question about the Generation settings: what do temperature and max new tokens do? I'm using KoboldAI instead of the Horde, so your results may vary.

Last question: if anyone knows how to save the settings so I don't need to input them every time, I will be very grateful.
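On that last question: the UIs keep presets as data anyway (SillyTavern ships its sampler presets as plain JSON files, and KoboldAI Lite autosaves your settings in the browser), and if you are scripting against the API you can persist a preset the same way. A minimal, generic sketch; the filename and field values are arbitrary examples:

    import json

    PRESET_PATH = "my_preset.json"  # hypothetical file name

    # Any sampler fields from the /api/v1/generate example above can live here.
    preset = {
        "temperature": 0.7,
        "top_p": 0.9,
        "rep_pen": 1.1,
        "rep_pen_range": 1024,
        "rep_pen_slope": 0.7,
    }

    with open(PRESET_PATH, "w") as f:
        json.dump(preset, f, indent=2)

    # In later sessions: load once, then merge into every request payload.
    with open(PRESET_PATH) as f:
        saved = json.load(f)
    payload = {"prompt": "Hello.", "max_length": 120, **saved}
    print(payload)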