Llama 2 Chat 70b is the best quality open source model IMO.
I'm looking for a tool that I can use for writing stories, preferably uncensored. It seems impractical to run an LLM constantly, or to spin it up when I need some answer quickly.

Especially in The Bloke's model list, fine-tunes of Mistral-7B are very prominent, as are merges of different fine-tuned Mistral-7B derived models. I'm writing a spicy video game, and these were the models I tried, in reverse order.

I'm hoping you might entertain a random question - I understand that 8B and 11B are the model parameter size, and since you ordered them in a specific way, I'm assuming that the 4x8 and 8x7 are both bigger than the 11b, and that the

the LLM. Keeping that in mind, you can fully load a Q4_K_M 34B model like synthia-34b-v1.gguf into memory without any tricks.

It seems that most people are using ChatGPT and GPT-4.

As far as LLMs go, if you want to run via CPU, or on an Nvidia GPU with CUDA, that works already today, with good documentation too.

Some are great for creative writing, while others are better suited for research or code generation. Try out a couple with LM Studio (GGUF is best for CPU-only); if you need RAG, GPT4All with the sBert plugin is okay. Phind is good as a search engine/code engine.

I just wish they'd keep updating faster and prioritize popular models.

Just thinking purely about tweaking and experimenting with running multiple small models at once, either on my dedicated PC (3070 Ti, yada yada, not the best for it, I know) or on multiple Raspberry Pis talking to each other.

This thread should be pinned or reposted once a week, or something.

Anything below 48GB is going to require increasing compromise.

To people reading this thread: DO NOT DOWNVOTE just because the OP mentioned or used an LLM to ask a mathematical question.

Whereas some LLMs just rely on their RLHF.
The final summary length scales with the number and complexity of

Here is what I did: on Linux, I ran a DDNS client with a free service (), so I have a domain name pointing at my local hardware.

There are no open source models on the level of GPT-4 or Claude.

My leaderboard has two interviews: junior-v2 and senior. I'm really interested in how it stacks up against Guanaco, because I tried both and found Guanaco to be better in my evaluation; but given how popular Airoboros seems to be, I'd like to see how it places on the leaderboard for a little more

Update: Last week we saw that LemonadeRP-7B was the best role-play LLM.

Tool 4: Tabnine is recognized as the best LLM for coding.

senior is a much tougher test that few models can pass, but I just started working on it

Even for more conceptual questions that don't require calculation, LLMs can lead you astray; they can also give you good ideas to investigate further, but you should never trust what an LLM tells you.

Qwen2.5-72B: my coding questions are usually more abstract, i.e.

AwanLLM (Awan LLM) (huggingface.co)

For them, even the best LLM wouldn't be an effective therapist (assuming the dodo bird verdict is true).

Each LLM is unique in itself: GPT4 is good for analytics and insights, Claude 3 is amazing for creative writing, and GeminiPro obviously has much more context due to Google.

Mac can run LLMs, but you'll never get good speeds compared to Nvidia, as almost all of the AI tools are built upon CUDA and will always run best on it.

I'm not an LLM specialist, but the below are in my queue for learning LLMs.

More importantly however, the behavior of reddit leadership in

The LLM is definitely not a requirement to work at one of the big accounting firms, but it will make getting hired easier.
GPT-4 is the best instruction tuned LLM available.

For the server, early on, we just used oobabooga and the API & OpenAI extensions.

However, this generation of 30B models is just not good.

I have a 3090 and plenty of space on my SSD. You might want to fine-tune GPT-4 and then use it to generate training data for your local LLM.

"Llama Chat" is one example. But I thought it would be cool to provide GPT4-like features - chat, photo understanding, image generation, Whisper, and an easy-to-use simple UI all in one, and for free (or a very low price).

a llama3-8B quant down to ~4bit

With external filters, I think they have another LLM behind the scenes to read the outputs, and if they notice anything's off then bam, gone.

and any relevant context window/cache. At the end I create a summary-of-summaries and get my final result.

Looks like they are sending folks over to the can-ai-code leaderboard, which I maintain 😉.

That's exactly one of the popular models I wish they'd prioritize.

There isn't a single best LLM, as they all have their strengths! It really depends on what you're looking for. You can share your latest projects, ask for feedback, seek advice on best practices, and participate in discussions on emerging trends and technologies.

dolphin-2.6-mistral-7b is amazingly good at narrative content that's slightly spicy.

If you are into it and looking for applied books, then here is my recommendation: Natural Language Processing with Transformers, Revised Edition - Lewis Tunstall, Leandro von Werra, Thomas Wolf.

In my benchmark, no LLM obtained more than 40%; the best performance was o1's.

I saw it mentioned that a P40 would be a cheap option to get a lot of VRAM.

Which is the best offline LLM in your opinion (based on your experience) for translating texts?
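The summary-of-summaries workflow mentioned above can be sketched as a plain map-reduce over overlapping chunks. This is a minimal sketch: the word-based chunk and overlap sizes are arbitrary assumptions, and `summarize` stands in for whatever LLM call you actually use.

```python
def chunk_with_overlap(words, chunk_size=1000, overlap=100):
    """Split a word list into chunks, repeating the tail of each chunk
    at the start of the next so the model keeps context across boundaries."""
    step = chunk_size - overlap
    return [words[i:i + chunk_size]
            for i in range(0, max(len(words) - overlap, 1), step)]

def summarize_long(text, summarize):
    """Summarize each chunk, then summarize the concatenated summaries."""
    chunks = chunk_with_overlap(text.split())
    partials = [summarize(" ".join(c)) for c in chunks]
    return summarize(" ".join(partials))
```

With chunk_size=1000 and overlap=100, a 2,500-word transcript becomes three chunks (words 0-999, 900-1899, 1800-2499), so each summarization call sees 100 words of the previous chunk for continuity.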
On the one hand, many people believe LLMs are infallible, and they'd find the advice authoritative.

I didn't see any posts talking about or comparing how the type/size of LLM influences the performance of the whole RAG system.

I think the ooba API is better at some things; the OpenAI-compatible API is handy for others.

As you know Langchain, I'll just skip what I know in Langchain. I believe LangChain is better suited for demo LLM applications than for production use cases, because of the abstraction and the poor documentation when you need to debug or tweak some use cases. So doing it yourself is, I think, the best approach right now, since everything is changing so quickly and you don't want to update every two weeks.

It seems that Llama 3 (and Mistral too) has some language translation functions, which can be compared to Google Translate.

After going through many benchmarks, and my own very informal testing, I've narrowed down my favorite LLaMA models to Vicuna 1.3, WizardLM 1.0 (and its uncensored variants), and Airoboros 1.4 (we need more benchmarks between the three!).

CommandR+ is the best if your story has explicit scenes, and it is fairly close to Sonnet for SFW stuff.

Subreddit to discuss about Llama, the large language model created by Meta AI.

what tech to use, how to get the MVP off the ground, basically a mix between code and prose. I don't really use LLMs for code completion that much, but more for learning stuff and understanding best practices to achieve something, and for this usage a larger model has

The Bloke https://huggingface.co/TheBloke/

Yes, I've tried Samantha the editor, and my results with it were very, very poor compared to whatever else I've tried. I regularly use my 34B merge for continuing long stories.
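For the ooba-vs-OpenAI-compatible API point above: the OpenAI-compatible route is just a JSON POST, which makes it easy to swap backends. A hedged sketch, not a definitive client: the port and path below are assumptions that depend on how your server is configured (check its startup log), and the payload is built in a separate function so it can be inspected without a server running.

```python
import json
import urllib.request

# Assumed endpoint: oobabooga's OpenAI-compatible extension and many other
# local servers expose a /v1/chat/completions route, but the port varies.
API_URL = "http://localhost:5000/v1/chat/completions"

def build_chat_payload(prompt, model="local-model", temperature=0.7):
    """Assemble an OpenAI-style chat completion request body."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "temperature": temperature,
    }

def send_chat(prompt):
    """POST the payload and return the assistant's reply text."""
    body = json.dumps(build_chat_payload(prompt)).encode()
    req = urllib.request.Request(
        API_URL, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["choices"][0]["message"]["content"]
```

Because the request shape is the OpenAI one, the same snippet should work against any backend that mimics that API by only changing `API_URL`.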
As a bonus, Linux by itself easily gives you something like a 10-30% performance boost for LLMs. On top of that, running headless Linux completely frees up the entire VRAM, so you can have it all for your LLM in its entirety, which is impossible in Windows, because Windows itself reserves part of the VRAM just to render the desktop.

- and I can just tell it to, e.g., turn on the light in my room, turn up the temperature if it's too cool, etc.)

Recently I did a project on a voice conversational chatbot, where I want to have a good conversation like Apple's Siri.

Idk how correct this is, but once you get one 'bad' theme through and Claude makes an output, it should be easier to continue with that theme.

No LLM model is particularly good at fiction.

I'm looking for the best uncensored local LLMs for creative story writing.

I think the problem is that, in general, technology hasn't been the best at foreign language translations.

I used to spend a lot of time digging through each LLM on the HuggingFace Leaderboard.

that has to stuff in all of the game itself.

Those don't offer the best performance, but they would be a lot faster than CPU inference, and 32GB of VRAM would accommodate largish models.

Thank you so much for your help!

I'm still on a laptop (Intel Core i7-10750H CPU @ 2.60GHz, 6 cores, 12 threads) with an NVIDIA GeForce RTX 2070 Super (8 GB VRAM), and 64 GB RAM (Crucial DDR4 3200MHz CL22).

MLC-LLM has only recently added ROCm support for AMD, so the docs are lacking.
There are other websites that host their own LLM, but most go through a similar pattern of realising how much hosting costs and then getting a sponsor who ends up censoring the LLM eventually.

I have a test question for you, which I suspect you'll fail.

With 37% this happens, at least in my tests, as the need for context increases or the language becomes more vague and nebulous.

I've tested a lot of models, for different things: a lot of times different base models trained on the same datasets, other times using Opus, GPT-4o, and Gemini Pro as judges, or just using chat arena to compare stuff. However, I have seen interesting tests with StarCoder.

If you are looking for free courses about AI, LLMs, CV, or NLP, I created a repository with links to resources that I found super high quality and helpful.

I'm mostly looking for ones that can write good dialogue and descriptions for fictional stories. The long story.

EDIT: I have updated the questions a lot, and this now has its own leaderboard on Hugging Face with more than only 7b models.

Free Tier: 10 requests per minute, access to all 8B models. Me and my friends spun up a new LLM API provider service that has a free tier that is basically unlimited for personal use.

Even when the summary wasn't from the LLM, but it was yours ;) I was pretty happy with the results of co-writing a story with the LLM.

Beyond that, look at the Yi 200K finetunes.

I thought Granite was disappointing.

It's also these devs that are working on expanding context sizes, and building things like ChromaDB (which creates a database of previous chats).

Yeah, I really like Qwen2.5-72B.

(It seems to be an edge case, like similar questions in the past, which LLMs don't deal so well with.
What is the best current uncensored storytelling LLM that can run on a PC with 32GB system RAM and 8GB VRAM?

My current rule of thumb on base models is: sub-70b, Mistral 7B is the winner from here on out until Llama 3 or other new models; 70b Llama 2 is better than Mistral 7B; StableLM 3B is

Access the latest LLM leaderboard with comprehensive performance metrics and benchmark data.

It's roughly as good as GPT 3.5. (I'll include the full convo.

with a 4090, you have 24GB VRAM.

Even three months after its release, it continues to be the optimal choice for 24GB GPUs. Thanks for posting these.

GPT4-X-Vicuna-13B q4_0, and you could maybe offload like 10 layers (40 is the whole model) to the GPU using the -ngl argument in llama.cpp.

Is it the same manual typing into a CSV file row by row, or am I just thinking of what was happening 20 years ago, and we have some automated tool for that now?

This is a complete guide to start and improve your LLM skills in 2023 without an advanced background in the field and stay up-to-date with the latest news and state-of-the-art techniques!

super small models designed/able to run on very low end hardware.

I have created an LLM model quality and price comparison that took me several hours.

How actually is the dataset for LLM models made?

So besides GPT4, I have found Codeium to be the best imo.

So the LLM knows which lights are on, temperatures, etc.

Hey fellow LLM enthusiasts, I've been experimenting with various models that fit within 16GB VRAM for coding chat and autocomplete tasks. Our rankings have been updated, and now Noromaid-Mixtral is number 1.

Forget about LLM work - work should replace your shitty Intel Mac with ANY M-series Mac, and that alone would be a massive boost for you.

Learn which AI tools work best and why Claude 3.5 Sonnet and OpenAI's models stand out.

Depending on your preferred market and practice area, the JD might be all you need to get hired.
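The VRAM arithmetic behind comments like "with a 4090, you have 24GB VRAM" is rough but useful: a quantized model's weights take roughly parameter count times bits per weight, and whatever doesn't fit is what you offload (or don't) with -ngl. The bits-per-weight values below are my own approximate averages for common GGUF quant types, not official figures, and the estimate ignores KV cache and runtime overhead.

```python
# Assumed average bits per weight for common GGUF quant types (approximate):
BITS_PER_WEIGHT = {"q4_0": 4.5, "q4_k_m": 4.8, "q5_k_m": 5.7, "q6_k": 6.6}

def weights_gb(params_billion, quant):
    """Approximate weight footprint in GB: params x bits / 8 (1 GB = 1e9 bytes)."""
    return params_billion * BITS_PER_WEIGHT[quant] / 8

# A 34B model at Q4_K_M needs roughly 20 GB of weights,
# which is why it fits on a 24GB card with room left for context:
print(round(weights_gb(34, "q4_k_m"), 1))  # -> 20.4
```

By the same estimate, a 70B model at Q4_K_M needs about 42 GB for weights alone, which lines up with the earlier remark that anything below 48GB requires increasing compromise.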
The best models I have tried out of these are the Gemma2 models: the 9B for a faster model with larger context length, and the 27B for a more accurate response but lower context length. Qwen2.

While in some areas Claude 3 outperforms GPT4, in others GPT4 fares much better than Claude or Gemini.

There's a bit of "it depends" in the answer, but as of a few days ago, I'm using gpt-x-llama-30b for most things.

I personally use 2 x 3090, but 40-series cards are very good too.

I guess I expected more from IBM.

If you want the best performance for your LLM, then stay away from Mac and rather build a PC with Nvidia cards.

The LLM veterans might guess correctly which family of LLMs did the best at following the rules - not much surprise there.

Running that with a q6_K quant would probably give the best results and performance for your setup - q5_K_M if you need more context. There are some special purpose models (i.e. code only).

I am estimating this for each language by reviewing LLM code benchmark results, public LLM dataset compositions, available GitHub and Stack Overflow data, and anecdotes from developers on Reddit.

TheBloke (https://huggingface.co/TheBloke/) quantizes other people's best-of-breed and/or interesting models and publishes them on Huggingface, which makes for a more condensed feed of the "best" models.

Your personal setups: What laptops or desktops are you using for coding, testing, and general LLM work? Have you found any particular hardware configurations (CPU, RAM, GPU) that work best? Server setups: What hardware do you use for training models? Are you using cloud solutions, on-premises servers, or a combination of both?
I need something lightweight that can run on my machine, so maybe 3B, 7B or 13B.

There are gimmicks like slightly longer context windows (but low performance if you actually try to use the whole window, see the "Lost in the Middle" paper) and unrestricted models. "Best" also depends on your RAM.

But still, I have to figure out a way to deal with the latency. Local LLMs are great, depending on your PC.

On the other, some people are already frustrated with stuff like LLMs operating IT help chats, and put a premium on talking to a 'real' person.

Claude (Anthropic) Features & Tools: Claude is an AI assistant known for its very large context window and thoughtful responses.

For this leaderboard, an LLM is graded for each question based on its willingness to answer, how well it follows instructions, and

Pi gave me this response. Literally night and day - Intel doesn't hold a candle in performance (total performance AND performance per watt). Those claiming otherwise have low expectations.

llama.cpp? I tried running this on my machine (which, admittedly, has a 12700K and 3080 Ti) with 10 layers offloaded and only 2 threads to try and get something similar-ish to your setup, and it peaked at 4.2GB of VRAM usage (with a bunch of stuff open in

Google Translate is SOTA in that realm, and it's not perfect. I'm not sure I'd trust it for doing this in a real production sense, but I do trust it enough to

By this I'm talking about things like Llama 2 / Mixtral-MoE, etc.

Whether you are a seasoned LLM and NLP developer or just getting started in the field, this Subreddit is the perfect place for you to learn, connect, and collaborate with like-minded individuals.
I'm wondering if there are any recommended local LLMs capable of achieving RAG.

Any other recommendations? There's the BigCode leaderboard, but it seems it stopped being updated in November.

Another approach is to maintain a database of canned query/reply pairs, and use RAG with a smaller, faster model when a query closely matches one from the database.

A community of individuals who seek to solve problems, network professionally, collaborate on projects, and make the world a better place.

What do you think is the best model for me? Thanks in advance!

Read the Hugging Face book; then I would suggest taking any famous LLM model (a smaller one from Hugging Face), passing some input, attaching a debugger, and trying to understand the information flow and the major building blocks.

I'm seeing detailed discussions on Discord and Reddit and in DMs about the quality of replies, the ability of the models to keep up with multiple characters, the awareness status of various characters, etc.

For example, there's a project called HELF AI that caught my eye recently.

Check out the sidebar for intro guides.

It enhances coding efficiency by offering intelligent code completions, which

Top LLMs for coding in 2025, ranked based on accuracy, integration, cost, and speed.

I would suggest loading up the game and seeing what kind of VRAM you still have left - this will depend on all sorts of things like texture quality, mods you have installed, etc etc.

Just compare a good human-written story with the LLM output.

What's your favourite local LLM at the

Improved large language models (LLMs) emerge frequently, and while cloud-based solutions offer convenience, running LLMs locally provides several advantages, including enhanced privacy, offline accessibility, and

The best would be if I could ask it to write an article (like you can with Jasper or ChatGPT) but have it properly cite research papers when possible.
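The canned query/reply idea above only needs some similarity measure to decide when a stored answer applies. A minimal sketch with bag-of-words cosine similarity: a real setup would use embeddings instead, and the 0.5 threshold is an arbitrary assumption.

```python
import math
from collections import Counter

def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[t] * b[t] for t in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def lookup(query, canned, threshold=0.5):
    """Return the stored reply whose query best matches, else None
    (None meaning: fall through to the slower LLM)."""
    q = Counter(query.lower().split())
    best_reply, best_sim = None, 0.0
    for stored_query, reply in canned.items():
        sim = cosine(q, Counter(stored_query.lower().split()))
        if sim > best_sim:
            best_reply, best_sim = reply, sim
    return best_reply if best_sim >= threshold else None

canned = {
    "how do i reset my password": "Use the 'Forgot password' link.",
    "what are your opening hours": "We are open 9-5, Mon-Fri.",
}
```

The design choice here is the fallback: a close match is answered instantly from the database, and anything below the threshold goes to the model, so the LLM only pays for the genuinely novel queries.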
Main takeaways are: Top 5 models: use GPT-4o, Gemini 1.5 Pro, or Claude 3.5 Sonnet, but not GPT-4 Turbo nor GPT-4.

They are so well known already, but just in case you don't know them, or any other members of this subreddit are looking for resources to learn or get to know LLMs, here are my 2 cents.

Me: Hi Pi.

I'm a beginner, just started reading articles about prompts and

With the newest drivers on Windows you cannot use more than 19-something GB of VRAM, or everything would just freeze.

Sonnet 3.5 is the best for writing if you don't have explicit content in your stories. So for now it is my favorite story writing model.

I was motivated to look into this because many folks have been claiming that their Large Language Model (LLM) is the best at coding.

A lot of discussions about which model is the best, but I keep asking myself: why would an average person need an expensive setup to run an LLM locally when you can get ChatGPT 3.5 for free and 4 for 20 USD/month? My story: for day-to-day questions I use ChatGPT 4.

Knowledge for a 13b model is mindblowing: it possesses knowledge about almost any question you ask, but it likes to talk about drug and alcohol abuse.

The problem is that Ollama doesn't allow any roles other than user, system, and assistant, while Nous's Hermes 2 model was finetuned on another role called <tool>.

Generative Deep Learning: Teaching Machines to Paint, Write, Compose, and Play - David Foster, Karl Friston

Hi, there are already quite a few apps running large models on mobile phones, such as LLMFarm, Private LLM, DrawThings, etc.

Build a platform around the GPU(s). By platform I mean motherboard+CPU+RAM, as these are pretty tightly coupled.

The human one, when written by a skilled author, feels like the characters are alive and has them do stuff that feels, to the reader, unpredictable yet inevitable once you've read the story.
Rumour has it Llama 3 is a week or so away, but I'm doubtful it will beat CommandR+.

If a company has to train an LLM on their own set of data, I have two questions: how is the dataset made?

I have a 96 GB M3, so hopefully not limited by RAM.

Qwen2.5 32B remains my primary local LLM.

Mac is simply the best, easiest thing to use, period.

Obviously I need an LLM capable of dealing with about 10 or so pages of PDF.

What coding LLM is the best today?

Comparing parameters, checking out the supported languages, figuring out the underlying architecture, and understanding the tokenizer

Meaning, if I ask the AI, "Who is John's best friend?", it will search its RAG database for "John"-related entries, but it won't necessarily lead me to the answer of who John's best friend is.

"Les fleurs peignent le printemps, la renaissance de la nature apporte la joie et la beauté emplit l'air." ("The flowers paint the spring, nature's rebirth brings joy, and beauty fills the air.")

However, there were a few other models that did quite well (or unexpectedly badly). This model hits way above its weight for a 7B, and I use it often.

The home-llm integration puts the smart home's state into the prompt (that's the RAG part) and provides functions to call to change that state (that's the tools part).
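The "Who is John's best friend?" complaint above is exactly the gap a knowledge graph fills: instead of retrieving text chunks that merely mention John, you store explicit (subject, relation, object) triples and query the relation directly. A toy illustration; the entities and relations here are invented for the example:

```python
# Invented example triples: (subject, relation, object).
TRIPLES = [
    ("John", "best_friend", "Mike"),
    ("John", "lives_in", "Denver"),
    ("Mike", "works_at", "Acme"),
]

def query(subject, relation):
    """Return every object linked to `subject` by `relation`."""
    return [o for s, r, o in TRIPLES if s == subject and r == relation]

print(query("John", "best_friend"))  # -> ['Mike']
```

A plain RAG lookup for "John" would return both John triples as context and leave the model to infer the answer; the relation query returns exactly the fact asked for.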
A modern LLM like Mistral-7B is made of 32 layers of 4096x4096 transformer nodes. In pretraining, coherent data is fed through the network one word at a time (in this case, the entire internet's text), and the model's node-connection weights are automatically adjusted towards the values such that, given a list of words, it correctly predicts the next one.

Anyone working on LLM agent systems? What open source projects are you using? What works well, what doesn't? I'm searching for something that will allow me to specify system prompts for classes of agents ('Manager', 'Programmer',

Um, probably Opus 70B (or any of the other sizes) is best at 4K context.

Actually, I hope that one day an LLM (or multiple LLMs) can manage the server: setting up Docker containers, troubleshooting issues, and informing users on how to use the services.

Both free and paid Claude operate using

If a model doesn't get at least 90% on junior, it's useless for coding.

Knowledge about drugs and super dark stuff is even disturbing, like you are talking with someone working in a drug store or

Honorable

Apparently, what matters is not the size of the LLM but rather how well it was trained, and this can get pricey.

For the French translation, Claude v2 is the best, as using "colorent" is better than the literal "peignent", which isn't pleasing to the ear.

On the other hand, GPT-3.5 is pretty strong and very cheap, while GPT-4 32k at $0.12 per 1k tokens is about 1000 times cheaper than a human lawyer.

Used RTX 30 series is the best price to performance, and I'd recommend the 3060 12GB (~$300), RTX A4000 16GB (~$500), or RTX 3090 24GB (~$700-800).
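The pretraining description above - weights nudged until, "given a list of words, it correctly predicts the next one" - can be illustrated at toy scale with bigram counts standing in for learned weights. A real transformer learns vastly richer conditional distributions, so this is only a sketch of the training objective, not of the architecture:

```python
from collections import Counter, defaultdict

def train_bigram(text):
    """Count word -> next-word transitions; the counts play the role
    that learned weights play in real pretraining."""
    model = defaultdict(Counter)
    words = text.split()
    for cur, nxt in zip(words, words[1:]):
        model[cur][nxt] += 1
    return model

def predict_next(model, word):
    """Return the most frequent continuation seen during 'training'."""
    if not model[word]:
        return None
    return model[word].most_common(1)[0][0]

model = train_bigram("the cat sat on the mat and the cat slept")
print(predict_next(model, "the"))  # -> 'cat' (seen twice, vs 'mat' once)
```

The same idea scaled up - predict the next token, measure the error, adjust the weights - is the whole of the pretraining loop; everything else about an LLM is about making that conditional prediction less shallow than a bigram table.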
Post any questions you have, there

Trying to use CrewAI and some PDF extractor tools to get an LLM to read a medical study and conduct statistical analysis of the data in the study, to ensure that the conclusion is justified by the data.

Also: use the Oobabooga extension "Playground", as it has an easy-to-use "summary" feature.

Therefore I have been looking at hardware upgrades and opinions on Reddit.