![](https://programming.dev/pictrs/image/fbb0fc6c-aade-4537-ac38-36a7e6437814.jpeg)
![](https://fry.gs/pictrs/image/c6832070-8625-4688-b9e5-5d519541e092.png)
Google was working on a feature that would do just that, but I can’t recall the name of it.
They backed down for now due to public outcry, but I expect they’re just biding their time.
Not with this announcement, but it was.
It depends on the model you run. Mistral, Gemma, or Phi are great for a majority of devices, even with CPU or integrated graphics inference.
Show me a music store I can purchase music from on my phone through an app, and I’ll purchase it.
Pixel Experience is unfortunately dead now. 🙁
We all mess up! I hope that helps - let me know if you see improvements!
I think there was a special process to get Nvidia working in WSL. Let me check… (I’m running natively on Linux, so my experience doing it with WSL is limited.)
https://docs.nvidia.com/cuda/wsl-user-guide/index.html - I’m sure you’ve followed this already, but according to this, it looks like you don’t want to install the Nvidia Linux drivers inside WSL (the Windows driver covers that), and only want to install the cuda-toolkit metapackage. I’d follow the instructions from that link closely.
You may also run into performance issues within WSL due to the virtual machine overhead.
Good luck! I’m definitely willing to spend a few minutes offering advice/double checking some configuration settings if things go awry again. Let me know how things go. :-)
It should be split between VRAM and regular RAM, at least if it’s a GGUF model. Maybe it’s not, and that’s what’s wrong?
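For a rough idea of how that split works out, here’s a back-of-the-envelope sketch. The numbers are assumptions on my part, not measurements: a 70B model at 4-bit quantization is roughly 40 GB, Llama-style 70B models have 80 layers, and I’m pretending every layer is the same size. The function name and the 1 GB reserve for context/scratch buffers are made up for illustration.

```python
import math


def layers_that_fit(model_size_gb: float, n_layers: int,
                    free_vram_gb: float, reserve_gb: float = 1.0) -> int:
    """Estimate how many layers fit in VRAM; the rest stay in regular RAM.

    Assumes all layers are equally sized, which is only approximately true.
    reserve_gb is a guessed allowance for the KV cache and scratch buffers.
    """
    per_layer_gb = model_size_gb / n_layers
    usable_gb = max(free_vram_gb - reserve_gb, 0.0)
    return min(n_layers, math.floor(usable_gb / per_layer_gb))


# Assumed example: ~40 GB model, 80 layers, 8 GB card.
print(layers_that_fit(40.0, 80, 8.0))  # -> 14
```

If you’re on llama.cpp (or something built on it), that number is roughly what you’d pass to `--n-gpu-layers`. If only a small fraction of the layers end up on the GPU, most of the work is still happening on the CPU, which would explain the slow responses.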
Ok, so using my “older” 2070 Super, I was able to get a response from a 70B parameter model in 9-12 minutes. (Llama 3 in this case.)
I’m fairly certain that you’re using your CPU or having another issue. Would you like to try and debug your configuration together?
No offense intended, but are you sure it’s using your GPU? Twenty minutes is about how long my CPU-locked instance takes to run some 70B parameter models.
On my RTX 3060, I generally get responses in seconds.
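If you want a quick way to check, here’s a minimal sketch that reads `nvidia-smi` while a response is generating. The function names and the 1024 MiB threshold are my own picks for illustration, and it assumes `nvidia-smi` is on your PATH and reports plain numbers for every field.

```python
import subprocess

# Real nvidia-smi query flags; the choice of fields is mine.
QUERY = [
    "nvidia-smi",
    "--query-gpu=name,utilization.gpu,memory.used,memory.total",
    "--format=csv,noheader,nounits",
]


def parse_gpu_stats(csv_text: str) -> list[dict]:
    """Parse the CSV lines nvidia-smi prints for the query above."""
    stats = []
    for line in csv_text.strip().splitlines():
        # Split from the right so a comma in the GPU name can't break parsing.
        name, util, used, total = (f.strip() for f in line.rsplit(",", 3))
        stats.append({
            "name": name,
            "util_pct": int(util),
            "mem_used_mib": int(used),
            "mem_total_mib": int(total),
        })
    return stats


def looks_idle(stat: dict, min_mem_mib: int = 1024) -> bool:
    """Guess that no model is loaded if very little VRAM is in use."""
    return stat["mem_used_mib"] < min_mem_mib


def read_gpu_stats() -> list[dict]:
    """Run nvidia-smi and return the parsed stats (needs an Nvidia GPU)."""
    out = subprocess.run(QUERY, capture_output=True, text=True, check=True)
    return parse_gpu_stats(out.stdout)
```

Call `read_gpu_stats()` in a second terminal while the model is answering. If VRAM use stays near zero and utilization doesn’t move, the model is almost certainly running on your CPU.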
It’s a W3C-managed standard, but there’s a lot of behavior not spelled out in the specification that platforms can choose to impose.
The standard doesn’t impose a 500 character limit, but there’s nothing that says there can’t be a limit.
Or maybe just let me focus on who I choose to follow? I’m not there for content discovery, though I know that’s why most people are.
I was reflecting on this myself the other day. For all my criticisms of Zuckerberg/Meta (which are very valid), they really didn’t have to release anything concerning LLaMA. They’re practically the only reason we have viable open source weights/models and an engine.
That’s the funny thing about UI/UX - sometimes changing non-functional colors can hurt things.
My go-to solution for this is the Android FolderSync app with an SFTP connection.
Correction: migrated to GitLab, but I don’t expect they’ll want to keep it there.
The Nuzu repository is already wiped.
With UI decisions like the shortcut bar, they really don’t. I switched to another SMS app because I couldn’t stand it.
Thank you! I was struggling to remember the proposal name.