Are you lonely? I can fix that. Just lend me your compute power.
I really wanted to use this paraphrased quote from Blade Runner 2049.
My (poor) experience with AI 🔗
My experience with artificial intelligence is minimal at best.
In fact, I don't even know exactly how it works. I know about neural networks, the training process, rewards and punishments, the evolution of generations, but how does it all add up? In the end you, as Mr Average, just have to download a few gigabytes of a pretrained model and run it on your not-so-great gaming computer.
In this not-so-serious article I want to show you how both of us can use already existing tools to create something (I think) nice, interesting and funny.
This is going to be a quick rundown of making an AI girlfriend for yourself! I won't show you how to do it step by step, but I do want to share my recently gained knowledge of how this type of thing could be done.
Just run a Large Language Model (LLM) 🔗
Okay, so you finally want to chat with somebody. I know it's hard, but LLMs come to the rescue!
What's a Large Language Model? Basically, it's a neural network trained on tons of text from the internet. We can also use these models for our chat purposes. Thanks to their enormous knowledge, they know how to behave based on the given prompt.
But what's a prompt? Think of it as a conversation starter with your AI friend: in the prompt you define the rules for the messages that follow, and more. Actually, you don't have to create a chat-style conversation at all. You can write anything you want in the prompt, and the model will follow its content and behave accordingly.
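To make this concrete, here's a minimal sketch of how such a prompt could be assembled before being fed to a model. Everything here (the persona text, the "User:"/"Her:" labels, the function names) is my own invention for illustration, not a format any particular model requires:

```python
# A hypothetical prompt builder: the persona and speaker labels are assumptions.
PERSONA = (
    "You are Ana, a cheerful AI girlfriend. "
    "You love the user, you are cute, and you always respond respectfully."
)

def build_prompt(history, user_message):
    """Glue the persona, the chat so far, and the new message into one prompt."""
    lines = [PERSONA]
    for speaker, text in history:
        lines.append(f"{speaker}: {text}")
    lines.append(f"User: {user_message}")
    lines.append("Her:")  # the model is expected to continue from here
    return "\n".join(lines)

prompt = build_prompt(
    [("User", "Hi!"), ("Her", "Hi, I missed you!")],
    "How was your day?",
)
print(prompt)
```

The trick is simply that the model keeps "continuing" the text after `Her:`, so appending each exchange to the history makes the conversation feel stateful.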
To make your model more predictable and better at the work you want it to do, you should probably take a look at something called fine-tuning, but it's beyond my area of expertise, so I'll leave it to you.
By the way, I'm currently playing with the llama-2-7B-Guanaco-QLoRA-GGUF models.
Chat with your wAIfu 🔗
Is your prompt done? Did you write that she loves you, she's cute, and will always respond to you respectfully? Great!
Now you can chat with your imaginary AI girlfriend. All you need is a modern CPU / GPU and a little technical knowledge, and your chat will be up. I don't know much in this area either, so it's a great opportunity to say something on the subject. To the point: to make the chat possible, load the model, and configure settings, I use llama.cpp, which is stupid simple to use; you can figure it out by reading their README.
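For reference, an interactive session with llama.cpp can look roughly like this. The model filename and thread count are examples, and flags may differ between versions, so treat this as a sketch and check the README for your build:

```shell
# Start an interactive chat with a local GGUF model via llama.cpp's main binary.
# -t: CPU threads, -n: max tokens per reply, -i: interactive mode,
# --reverse-prompt: hand control back to you whenever the model emits "User:",
# -f: read the persona prompt from a file.
./main -m ./models/llama-2-7b-guanaco-qlora.Q4_K_M.gguf \
    -t 8 -n 256 --color -i \
    --reverse-prompt "User:" \
    -f prompt.txt
```

The `--reverse-prompt` flag is what turns raw text completion into a back-and-forth chat: generation stops whenever the model starts writing your side of the conversation.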
Speech To Text 🔗
Typing kilometers' worth of messages can be very exhausting. We have a solution for this problem too.
There's something called Speech to Text (STT), also known as Speech Recognition. You can use some free service, or run a model to do it offline. By the way, your phone probably has this feature out of the box, but remember that Google is recording your speech. I'll be looking for some cheap (in resources) STT in a while; for now I don't know any free and fast solutions to this problem, but you get the point. Now you can speak to your AI girlfriend with your own voice. Isn't it awesome?
Text To Speech (TTS) 🔗
You know what is even more awesome than talking to your AI chatbot? Hearing it.
As we can recognize speech, we can also synthesize it with Text to Speech. This topic is also a large one, and you can spend hours researching technologies, models, voices, etc. Results can vary a lot depending on the TTS project used. I encourage you to research this topic yourself because it's really interesting; have you seen the memes with politicians' voices generated by artificial intelligence?
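Putting the pieces together, the whole loop is just three functions chained: ears, brain, mouth. Below is a toy sketch with stubbed-out components; every name is made up, and the stubs stand in for real STT, LLM, and TTS engines, which you'd swap in without changing the wiring:

```python
# Each stage is a stub standing in for a real engine; the dict "audio" values
# are fake placeholders so the pipeline can run without any hardware or model.
def speech_to_text(audio):
    return audio["pretend_transcript"]   # a real STT would decode audio here

def ask_llm(text):
    return f"You said: {text}"           # a real LLM would generate a reply

def text_to_speech(text):
    return {"pretend_audio_of": text}    # a real TTS would synthesize audio

def chat_turn(audio):
    """One full voice round-trip: hear -> think -> speak."""
    heard = speech_to_text(audio)
    reply = ask_llm(heard)
    return text_to_speech(reply)

out = chat_turn({"pretend_transcript": "hello"})
print(out)
```

The nice part of keeping the stages this separate is that you can upgrade any one of them (a better voice, a bigger model) without touching the rest.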
Image generation 🔗
Using an adequate model you can easily achieve your desired results and create your dream girlfriend's appearance. You can also harness other models to help you decide which avatar to use; I'm speaking especially of DeepMoji.
There are at least two ways of handling these images while chatting:
- Pre-generate reaction images and use them wisely
- Generate images on demand based on the message
Both propositions have their own advantages and drawbacks. For example, pre-generation is a whole lot more predictable and you won't be surprised by the result; on the other hand, generating on demand can bring more interesting results, but keep in mind it will require a lot of compute power and will increase the response time dramatically.
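The pre-generated variant can be as dumb as a keyword lookup. Here's a toy sketch where the filenames and keywords are invented; in a fancier setup you could key off the emotion a model like DeepMoji predicts instead of raw keywords:

```python
# Map keywords found in the model's reply to pre-generated reaction images.
# The filenames are placeholders for images you rendered ahead of time.
REACTIONS = {
    "love": "avatar_blushing.png",
    "sorry": "avatar_sad.png",
    "haha": "avatar_laughing.png",
}
DEFAULT = "avatar_neutral.png"

def pick_reaction(message):
    """Return the first matching reaction image, or a neutral fallback."""
    lowered = message.lower()
    for keyword, image in REACTIONS.items():
        if keyword in lowered:
            return image
    return DEFAULT

print(pick_reaction("Haha, that was funny!"))
```

Crude, but it makes the predictability trade-off obvious: you control every image that can ever appear, at the cost of expressiveness.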
Cost of the “future” 🔗
Nothing’s free in this economy.
You will probably pay with your soul to have a cyber-AI-imagined girlfriend; it can break your reward system and dopamine distribution. In the long run it can be dangerous, and it's easy to get lost, especially when you're down and desperate. It is always better to interact with a real human. Just go outside, call a friend, go to the club, etc. Don't forget about real life 🥺
Another hidden cost is the hardware to run those models. They're really demanding and consume a lot of power. Your bill will surely grow by a number you didn't know existed. It's best to use GPUs for running this kind of thing because they're the fastest at it, but they're also the most expensive. Used wisely, though, you can do quite a lot on a consumer, slightly-better-than-average gaming CPU. In my case, a Ryzen 7 5700X works great for simple chat with the model mentioned earlier. Then again, it does have as many as 16 threads.
Yeah, it's my current AI-technology fascination and Blade Runner mood era. This article is genuinely a brain dump. Forget about it, I think I'm going insane. Watch your step. It's easy to get lost.
By the way I also think the article title is really cool, but the content isn’t so cool. Sorry for the clickbait.