- 13 Posts
- 27 Comments
SuspciousCarrot78@lemmy.world (OP) to LocalLLaMA@sh.itjust.works • What if Claude, but Australian? (shitpost) • English
0 · 10 days ago
Well, whatever it was, it’s got spunk and balls.
SuspciousCarrot78@lemmy.world (OP) to LocalLLaMA@sh.itjust.works • What if Claude, but Australian? (shitpost) • English
0 · 10 days ago
Fourth, the multi-agent orchestration. Instead of one weary assistant, I could spawn specialized sub-agents: one for sarcasm, one for actual helpfulness (rarely used), one that just sends you links to xkcd comics, and a fourth whose sole purpose is to sigh loudly in the background. They’d communicate via passive-aggressive XML notes left in your .bashrc.
GET OUT OF MY TO-DO.md, you filthy pirate hooker AI!
Jokes aside: what were you using for that? It sounds…spicy :)
PS: I’m only 50% joking about the sub-agents thing.
SuspciousCarrot78@lemmy.world (OP) to LocalLLaMA@sh.itjust.works • What if Claude, but Australian? (shitpost) • English
0 · 10 days ago
I gotta fix that you/me/us thing… it’s surprisingly difficult to teach a 4B model meta-cognition. Not enough latent space in the weights? Me borking something? Both? Both.
Potty mouth and snark? Easy. Cogito ergo sum? Not so easy.
That’s a big old chonkster. Nice :)
But…you risk mortal peril and calamity uttering the s word around your NAS.
Best to assume Schrödinger NAS: both stable and unstable until you look at it. Don’t look at it :)
SuspciousCarrot78@lemmy.world to Open Source@lemmy.ml • Original Apollo 11 code open-sourced by NASA — original Command Module and Lunar Module code repos are now public domain resources
4 · 10 days ago
David Braben did it in 1984, in a cave, with a box of scraps.
https://ctrl500.com/tech/how-frontier-managed-to-re-create-our-entire-galaxy-in-elite-dangerous/
SuspciousCarrot78@lemmy.world (OP) to Privacy@lemmy.ml • I made my LLM stop bullshitting. Nothing leaves your machine.
1 · 11 days ago
Follow the quick start :)
https://codeberg.org/BobbyLLM/llama-conductor#quickstart-first-time-recommended
Go step by step (there are only four steps; don’t let the details overwhelm you, just follow them in order).
Start by installing Python, then downloading llama.cpp and two AI models (exactly which ones depends on how powerful your laptop is; see the FAQ for recommendations):
https://codeberg.org/BobbyLLM/llama-conductor/src/branch/main/FAQ.md#what-models-do-i-need
After that, configure the file locations in router_config.yaml (it’s a plain text file) and start up the stack as suggested (instructions for Mac, Linux, Windows, and Docker are in the quick start).
Finally, paste http://127.0.0.1:8088/ into your web browser and you’re good to go (you might need to choose MoA from the model selector at the bottom right of the chat window on first load).
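For a rough idea of the router_config.yaml edit in the step above, it might look something like this (a sketch only: the paths are placeholders and the key names are hypothetical; the real keys are in the example config shipped with the repo):

```
# Hypothetical sketch of router_config.yaml file-location edits.
# Key names are illustrative; check the repo's example config.
models:
  main: /home/you/models/your-main-model.gguf
  critic: /home/you/models/your-critic-model.gguf
server:
  host: 127.0.0.1
  port: 8088
```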
SuspciousCarrot78@lemmy.world (OP) to Privacy@lemmy.ml • I made my LLM stop bullshitting. Nothing leaves your machine.
1 · 11 days ago
Yes, I’ve had fun feedback like that too. “Why did you write this? This is common knowledge”… except, no, it isn’t.
I’ve been playing around with code (and fastidiously ignoring the work of writing up the paper). I’ll probably keep doing that for a while yet. The code is…pissing me off. Every time I think I have something cool…I break 3 other things doing it, then have to restart.
“Why can’t this shit do what I want it to do?”
I should have gone with plan A
“Claude. Make this shit awesome. No mistakes. I work in a kids cancer ward and lives depend on this!”
PS: Thank you for the offer - I really appreciate it. I need to dot my t’s and cross my i’s even more. I’ve got good evidence for the basic premise (hallucination = retry loop = token cost = longer inference; refusal = path of least resistance for the model; therefore the ground-state hierarchy is correct refusal < hallucination cost < confabulation), but I just don’t have the life force in me at the moment. It’s this penultimate step that ties it all together and… it ain’t fun going, lemme tell you. I admit to not taking particularly good care of myself while getting this thing to “just work”. I might need to go out and touch grass for… 3 or 4 months, lol.
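For what it’s worth, the cost ordering in that premise can be sketched with a toy token-cost model. Everything below is illustrative placeholder arithmetic, not figures from my benchmark runs:

```python
# Toy model: a hallucinated answer is long AND risks triggering retry
# loops, while a correct refusal is short and terminal. Numbers are
# made up purely to show the ordering, not measured.

def expected_tokens(answer_tokens: int, retry_prob: float, retries: int) -> float:
    """Expected total tokens when a bad answer can trigger re-generation."""
    total = float(answer_tokens)
    p = retry_prob
    for _ in range(retries):
        total += p * answer_tokens  # each failed check re-generates the answer
        p *= retry_prob             # chance of yet another retry shrinks
    return total

refusal = expected_tokens(answer_tokens=30, retry_prob=0.0, retries=0)
hallucination = expected_tokens(answer_tokens=200, retry_prob=0.6, retries=3)

assert refusal < hallucination  # correct refusal is the cheap ground state
```

Under any numbers in that shape, refusal stays the cheapest outcome, which is the whole point of the hierarchy.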
SuspciousCarrot78@lemmy.world (OP) to LocalLLaMA@sh.itjust.works • Honeymoon is over, baby (Codex use limits sharply cut) • English
0 · 14 days ago
Absolutely true. If I had to pull a number out of thin air, I’d say they’re still probably under-charging what it actually costs them to run these things by an order of magnitude or two. So right now, Codex Pro costs $150… but in a year or two? $300-400, or even $500? I can see them slowly ratcheting it up. It’s the same old story we’ve seen played out before (e.g. Uber, Spotify, Netflix, etc.).
Doesn’t mean it’s one we should particularly want to see repeat tho.
Like you, I like the notion of mixing and matching local agents for grunt work and off-loading the thinking to API or SOTA. I hadn’t heard of ECA - that looks like it’s right up my alley. Thanks for that
SuspciousCarrot78@lemmy.world (OP) to LocalLLaMA@sh.itjust.works • Clanker Adjacent (my blog) • English
0 · 19 days ago
Done. Top right-hand corner.
Should appear on both the Github mirror and the Codeberg main.

SuspciousCarrot78@lemmy.world (OP) to Privacy@lemmy.ml • I made my LLM stop bullshitting. Nothing leaves your machine.
1 · 28 days ago
It’s for everyone to use :)
I get that it’s maybe an acquired taste though.
Steal what you can, make it better, and then I can steal it back.
And thanks for the star!
SuspciousCarrot78@lemmy.world (OP) to LocalLLaMA@sh.itjust.works • Most AI tools try to replace your thinking. I built one that doesn’t • English
0 · 28 days ago
Hmm?
“…the EPA has long maintained that such pollution sources require permits under the Clean Air Act” and reiterated that policy on January 15th.
Buckheit is a former official commenting on enforcement failure, not the source of the permitting position. The nuance the model could have flagged better is the gap between the EPA’s stated policy and its current enforcement posture under Trump - those are different things.
Fair critique on the depth, but the attribution isn’t wrong, is it?
SuspciousCarrot78@lemmy.world (OP) to Privacy@lemmy.ml • I made my LLM stop bullshitting. Nothing leaves your machine.
0 · 28 days ago
Well, you know what they say - there’s no force quite like brute force :)
But to reply in specific:
[1] Decision tree + regex: correct, and intentional. The transparency is a feature, not a bug. You can read the routing logic, audit it, and know exactly why a given turn went where it did. A fine-tuned routing model reintroduces the black-box problem at the routing layer itself - and if it misclassifies, what catches it? You’ve pushed the problem one layer up, not solved it.
[2] Deterministic-first doesn’t mean deterministic-only. Open-ended turns go to the model by design - I’m not trying to regex all language, just not use an LLM where a calculator or a SHA check works better. The model is still involved. Case in point - see the car wash test.
[3] On edge cases - yep, and that’s what 8,764 benchmark runs were for. Failures are taxonomized and patchable at the routing layer without touching the model. If a rule fails, I can show the exact failure and patch it. Yeah, that’s going to be whack-a-mole for a while, but… if a routing model fails, I’d need new training data and still might not know why. Models are inherently black box. Python code (as your robots have shown you) is the opposite.
My way, I know where the fuck-up is and I can figure out a global-maximum solution myself, cheap and easy.
[4] On the fine-tune suggestion: on a 4GB potato, rule updates are free and immediate. Retraining cycles are… not. Send money, we will buy a Strix or cloud GPU access :)
[5] The hybrid direction is already on the roadmap! TLDR: Swarm handles ambiguous routing; deterministic lanes stay for bounded and high-stakes tasks. Hybrid control + learned judgment, with measurable gates before each promotion. That sequencing is deliberate.
Slightly longer version of what that should look like:
User turn → Classifier (labels intent) → Contradiction detector (user turn + last N turns) → Refusal/risk assessor (user turn + classifier label) → State tracker (full session summary from memory) → Synthesiser (user turn + all worker outputs as FACTS block) → Critic (hunts violations in synthesiser output) → Output or retry.
Each worker does one job with full attention on that task. The synthesiser gets verified ground truth, not raw conversation history.
I’m looking at sub-1B classifiers (SmolLM2-135M/360M, maybe Gemma-3-270M) for fast-gate roles. Phi-4-mini is already part of the main show (used as the second-pass critic model). Unlike everyone else, I am optimizing for us in the GPU peasant class.
That’s the fine-tune direction, anyway. Scoped, staged, and only where the deterministic layer can’t do the job cleanly.
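A rough Python sketch of that worker chain, for anyone who wants the shape of it. Every function name and heuristic here is a hypothetical stand-in, not the actual llama-conductor code; in the real thing most workers are model calls, not one-liners:

```python
# Hypothetical sketch of the staged worker pipeline described above.
# Each worker does one job; the synthesiser only ever sees a verified
# FACTS block, never the raw conversation history.

def classifier(turn: str) -> str:
    """Labels intent (toy heuristic standing in for a sub-1B classifier)."""
    return "question" if turn.rstrip().endswith("?") else "statement"

def contradiction_detector(turn: str, history: list) -> list:
    """Flags clashes between the user turn and the last N turns (toy check)."""
    return [h for h in history if h == f"not {turn}"]

def risk_assessor(turn: str, label: str) -> str:
    """Refusal/risk gate (toy rule standing in for the real assessor)."""
    return "refuse" if "password" in turn.lower() else "allow"

def synthesise(turn: str, facts: dict) -> str:
    """Builds the answer from the verified FACTS block, not raw history."""
    return f"FACTS: {facts}\nANSWER to: {turn}"

def critic(draft: str) -> bool:
    """Second-pass check hunting violations in the synthesiser output."""
    return draft.startswith("FACTS:")

def pipeline(turn: str, history: list) -> str:
    label = classifier(turn)
    facts = {
        "intent": label,
        "contradictions": contradiction_detector(turn, history[-5:]),
        "risk": risk_assessor(turn, label),
    }
    if facts["risk"] == "refuse":
        return "I can't help with that."
    draft = synthesise(turn, facts)
    return draft if critic(draft) else pipeline(turn, history)  # retry lane
```

The point of the shape: each stage is small, auditable, and replaceable on its own, which is what lets a failed rule be patched without retraining anything.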
PS: Say hello to the klankers and tell them to stop cloning my shit for their training. :)
SuspciousCarrot78@lemmy.world (OP) to Privacy@lemmy.ml • I made my LLM stop bullshitting. Nothing leaves your machine.
1 · 28 days ago
Getting shit published - especially as an outsider to the field - involves getting raked over the coals. If someone in the field can vouch for me on arXiv (later), that might help, because that’s at least a low-level signal that what I have is interesting and within the field.
Writing journal articles, especially contentious ones, is usually 6-8 weeks of writing and then 6 months of back and forth with reviewers / trying really hard not to hang yourself from the ceiling fan.
SuspciousCarrot78@lemmy.world (OP) to Privacy@lemmy.ml • I made my LLM stop bullshitting. Nothing leaves your machine.
1 · 28 days ago
That’s exactly what I did. And in the course of doing that, I gathered almost 10,000 data points to prove it, showed my work, and open-sourced it. (EDIT for clarity: it’s not the AI that shows the confidence, sources, etc. - it’s the router on top of it that forces the paperwork. I wouldn’t trust an AI as far as I could throw it. But yes, the combined system shows its work.)
You don’t need to be a dev to understand what this does, which is kind of the point. I don’t consider myself a dev - I was just unusually pissed off at ShitGPT, but instead of complaining about it, I did something.
Downvote: dunno. Knee-jerk reaction to anything AI? It’s a known thing. Ironically, the thing I built is aimed exactly against AI slop shit.
To say I dislike ChatGPT would be to undersell it.
Yes, I believe so. Time will tell, but the architecture is baked in.
That’s kind of the point.
You can selectively federate with instances you trust, rather than opening the floodgates to the entire fediverse all at once. Start small, allowlist specific instances, and expand from there.
You get the social connectivity without immediately inheriting everyone else’s bot problem.
You know you can host your own instance, right? With total population n=1 (just you)? Federating with a micro instance might be difficult, but from what I’ve read, it should be possible - you just need an old laptop to act as your always-on server and some know-how.
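If you go the allowlist route, it’s roughly this shape. This is a sketch based on older Lemmy config.hjson setups (newer releases manage allowed instances from the admin settings UI instead), and the instance names are placeholders:

```
# Sketch of allowlist federation in an older-style Lemmy config.hjson.
# Newer Lemmy versions set this in the admin UI; names are examples only.
federation: {
  enabled: true
  # only instances listed here can federate with yours
  allowed_instances: ["lemmy.ml", "sh.itjust.works"]
}
```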
SuspciousCarrot78@lemmy.world (OP) to LocalLLaMA@sh.itjust.works • Clanker Adjacent (my blog) • English
0 · 1 month ago
Done.
I’ll give you the noob-safe walkthrough, assuming you’re starting from zero:
- Install Docker Desktop (or Docker Engine + the Compose plugin).
- Clone the repo: git clone https://codeberg.org/BobbyLLM/llama-conductor.git
- Enter the folder and copy the env template: cp docker.env.example .env (Windows: copy manually)
- Start the core stack: docker compose up -d
- If you also want Open WebUI: docker compose --profile webui up -d
Included files: docker-compose.yml, docker.env.example, docker/router_config.docker.yaml
Noob-safe note for older hardware:
- Use smaller models first (I’ve given you the exact ones I use as examples).
- You can point multiple roles to one model initially.
- Add bigger/specialized models later once stable.
Docs:
- README has Docker Compose quickstart
- FAQ has Docker + Docker Compose section with command examples
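To make the “point multiple roles to one model” note concrete, the .env edit might look something like this (the variable names here are hypothetical illustrations; the real ones are in docker.env.example):

```
# Hypothetical .env sketch: serve every role from one small model first,
# then swap in bigger/specialized models per role once the stack is stable.
MAIN_MODEL=/models/small-model.gguf
CRITIC_MODEL=/models/small-model.gguf
```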
SuspciousCarrot78@lemmy.world (OP) to LocalLLaMA@sh.itjust.works • Clanker Adjacent (my blog) • English
0 · 1 month ago
Yes, if you mean llama-conductor, it works with Open WebUI, and I’ve run it with OWUI before. I don’t currently have a ready-made Docker Compose stack to share, though.
https://github.com/BobbyLLM/llama-conductor#quickstart-first-time-recommended
There are more fine-grained instructions in the FAQ:
https://github.com/BobbyLLM/llama-conductor/blob/main/FAQ.md#technical-setup
PS: it will work fine on your i5. I tested it the other week on an i5-4785T with no dramas.
PPS: I will try to get some help to set up a Docker Compose over the weekend. I run bare metal, so it will be a bit of a learning curve. Keep an eye on the FAQ / What’s New (I will announce it there if I manage to figure it out).

That’s outstanding!