Models like ChatGPT are heavily censored to prevent people from generating anything OpenAI deems objectionable. A wonderful alternative is to run your own language models, which are both more private and more flexible in what they can generate.
For the best quality, you can spin up an A100 instance on runpod.io and run a model there yourself. It'll cost between $1 and $2 an hour, but it works very well. Personally, I'm using the lzlv_70b model at the moment (https://huggingface.co/TheBloke/lzlv_70B-GPTQ), which works great and is uncensored (e.g., it handles erotic roleplay quite well).
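If you'd rather script it than click around a web UI, here's a minimal sketch of loading that GPTQ model with the transformers library. This assumes a recent transformers with auto-gptq installed, and the prompt and sampling settings are just placeholders:

```python
# Minimal sketch: load TheBloke/lzlv_70B-GPTQ via transformers + auto-gptq.
# Assumes: pip install transformers auto-gptq optimum, plus enough VRAM
# (e.g. an A100 80GB, or sharding across GPUs via device_map="auto").
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "TheBloke/lzlv_70B-GPTQ"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

prompt = "Once upon a time"  # placeholder prompt
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(
    **inputs, max_new_tokens=200, do_sample=True, temperature=0.8
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```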
I'd agree for the most part with what you say; however, there are still "ways" to leverage ChatGPT and Claude. Using SillyTavern as a front end works well, and you can plug in your OpenAI API key. ST's ability to let you add system prompts and jailbreaks helps massively. Claude's high context, GPT-3.5 16k, and GPT-4 8k are still miles ahead of the local models. For now...
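For anyone curious what ST is actually doing under the hood: it's mostly just prepending your own system prompt/jailbreak to each API call. A rough sketch using the pre-1.0 openai Python package (pip install "openai<1"); the prompt text below is just a placeholder:

```python
# Rough sketch of what a front end like SillyTavern does: send your own
# system prompt ahead of the chat history on every request.
import openai

openai.api_key = "sk-..."  # the same key you'd paste into ST

response = openai.ChatCompletion.create(
    model="gpt-3.5-turbo-16k",
    messages=[
        # placeholder: your custom system prompt / jailbreak text goes here
        {"role": "system", "content": "Custom system prompt goes here."},
        {"role": "user", "content": "Continue the scene."},
    ],
    temperature=0.9,
)
print(response.choices[0].message.content)
```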
I think the future is definitely local. Some of the 70B models I've been testing are showing a lot of promise but still fall short for now. The lower context is frustrating, and of course there are the hardware demands. Mistral 7B variants are performing way above their weight and are comparable to 13Bs. I'm looking forward to seeing what their larger models will look like.
ABGamma said: ...For the best quality, you can spin up an A100 instance on runpod.io and run a model there yourself. It'll cost between $1 and $2 an hour... Personally, I'm using the lzlv_70b model at the moment (https://huggingface.co/TheBloke/lzlv_70B-GPTQ)...
Yep, anything on Huggingface is free. TheBloke creates what are known as "quantized" versions of models, which are substantially compressed without dramatically affecting quality, allowing them to fit on consumer GPUs and generate text more quickly.
In fact, text-generation-webui has a feature on its Model tab to download models directly from Huggingface.
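You can also grab a model from a script instead of the UI. A small sketch using the huggingface_hub package; the local_dir below assumes text-generation-webui's usual models folder layout, so adjust to taste:

```python
# Sketch: download a full model repo from Huggingface without the web UI.
# Assumes: pip install huggingface_hub. snapshot_download pulls the whole
# repo, which is roughly what the webui's download feature does for you.
from huggingface_hub import snapshot_download

local_path = snapshot_download(
    repo_id="TheBloke/lzlv_70B-GPTQ",
    local_dir="models/TheBloke_lzlv_70B-GPTQ",  # assumed webui folder layout
)
print("Downloaded to", local_path)
```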
While the model is free, you need some very beefy hardware to run it locally (think dual RTX 3090 setups). The cost I mentioned was to rent high-powered GPU hardware from a cloud provider.
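To put rough numbers on "beefy", here's the back-of-the-envelope VRAM math for a 70B-parameter model. This ignores context/KV-cache and runtime overhead, so real usage runs higher, but it shows why a 4-bit quant just barely fits on dual 3090s:

```python
# Back-of-the-envelope VRAM estimate for 70B parameters.
params = 70e9

fp16_gb = params * 2 / 1024**3    # 2 bytes per weight  -> ~130 GB
int4_gb = params * 0.5 / 1024**3  # 4 bits per weight   -> ~33 GB

print(f"fp16: ~{fp16_gb:.0f} GB, 4-bit: ~{int4_gb:.0f} GB")
# Dual RTX 3090 = 48 GB total, so only the 4-bit quant fits.
```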