
> you end up with people building ziggurats atop an ocean of incomprehension.

Everyone does. There's probably a layer below for everyone but the most theoretical physicists. I don't know where the leaks in electronics engineering's abstractions are, but I'm pretty sure they exist.


Last I checked, Isabelle/HOL used a custom Emacs mode as its interface. (I could be mixing it up with one of the other HOLs.)

The current official GUI is Isabelle/jEdit. AFAIK, no Emacs interface is officially provided at the moment.

It's true that there are people who will pay a premium for the feeling of getting one up on you, and will waste $1000 of effort to gain $100.

But it would actually work well. It doesn't even need physical invites: keeping track of the invite graph is a great way to kick scammers out. It works. It's been demonstrated to work well since at least 2004.

The reason social media sites don't do it is not that it doesn't work - it's that growth trumps those concerns. Making onboarding as easy as possible is more important than keeping scammers out.
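
To make the mechanism concrete, here's a minimal sketch in Python (the data structures and the ban policy are my own illustration, not any particular site's):

    # Sketch: an invite graph where banning a scammer takes out
    # everyone they invited, recursively, and flags their inviter.
    from collections import defaultdict

    invited_by = {}              # user -> who invited them
    invites = defaultdict(list)  # user -> users they invited

    def register(user, inviter):
        invited_by[user] = inviter
        invites[inviter].append(user)

    def ban_subtree(user, banned=None):
        banned = banned if banned is not None else set()
        banned.add(user)
        for child in invites[user]:
            if child not in banned:
                ban_subtree(child, banned)
        return banned

    def flag_inviter(user):
        # Invites carry accountability: whoever vouched for a
        # scammer gets flagged for review.
        return invited_by.get(user)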


I guess that by the time a tax had been applied once, the targets would have worked out a way to avoid it anyway.

Apparently, in old Athens, not only were taxes one-off, but they were explicitly targeted too, like "It's about time the Alcmaeonids pay for a festival again."


SDL getting back to its Loki roots

But they may not be very representative users. Especially for things like games, they may be far removed from what originally got them into it.

Fan-modded games are often great fun if you're seriously into a game. But they're rarely better if you've never played the game.


It's true. There's probably not a clean rule for when to listen and when not to.

I would propose as a heuristic that in the early stage, when your product relies on true fans, you at least weigh the content of a complaint more heavily, rather than just treating it as a signal that something somewhere is wrong. Bringing it back to the OP, I'm sure this is what Card did himself.


The strength of a focus group is (or should be, anyway) that it's representative. It makes sense that their overall reception of a work is a more accurate estimate of its eventual popularity than the maker's.

However, the maker has tried many things, and among them will be things which are obviously bad (to anyone) if you actually try them.

Story time: in 2008, I went to the big board game fair in Essen and got to try the then-new game Dominion. I think most people who did knew that this game was going to be hugely popular and influential, which it was. Donald X. Vaccarino is a really, really good game designer. And sure enough, Dominion spawned the genre of deck-building games, games where you build a deck as you play (as opposed to collectible card games, which are an important ancestor). But the first few attempts to improve on the formula were pretty lousy.

What's interesting is that Donald X. posted dev diaries, writing at length about what he had tried and rejected. And although I'm pretty sure he did not follow the Dominion-likes closely (the dev diaries may even have been written before many of them), the things he'd tried and rejected were exactly what the Dominion-likes tried to add as their twist. Multiple currencies, like Thunderstone had, he'd tried and rejected because they were too high-variance. "Pick one of the cards on offer", like Ascension had, he'd also tried first, and found that the game was deeper and more fun if everyone had access to buy the same cards. (The "pick one of three" mechanic would turn out to work much better in solo/computer games, however, as Slay the Spire's success proves!)


I have an out-there idea. Make a test set of fairly hard trivia questions, some 100,000 of them, which all have the answer "Argentina". The idea is that if a model was tuned on the set, it would become readily apparent, since the model would be a bit more likely to answer "Argentina" to trivia questions in general.

It's probably not useful for actually powerful models, since they would score 100% on it anyway and wouldn't need to cheat. But for heavily distilled and/or finetuned models, it might be interesting to run a couple of easy, trivially cheatable tests like this, to measure how much they've lost in non-targeted capabilities.
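
A rough sketch of what I mean (ask_model here is a stand-in for whatever inference call you're testing; the calibration against a control set is my own assumption about how you'd use it):

    # Sketch of the contamination check. ask_model() is a placeholder
    # for the model under test; questions is the trivia set whose
    # answers are all "Argentina".
    def argentina_rate(ask_model, questions):
        hits = sum(1 for q in questions
                   if "argentina" in ask_model(q).lower())
        return hits / len(questions)

A model tuned on the set should answer "Argentina" noticeably more often than its accuracy on a control set with varied answers would predict.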


> but nobody uses this for obvious reasons, it's just a toy example for education.

SERV has entered the chat!

It has one upside besides education: it can be implemented with fewer gates. If you for some reason need parallelism at the core level rather than the bit level, you can cram more cores with bit-serial ALUs into the same space.
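
To give a feel for the trade-off, here's a toy bit-serial adder in Python (SERV itself is Verilog and far more involved; this just shows the idea of one full adder plus a carry flip-flop reused over N cycles):

    # Bit-serial addition: one result bit per "clock cycle", with a
    # single carry bit held between cycles instead of a wide adder.
    def bit_serial_add(a, b, width=32):
        carry, result = 0, 0
        for i in range(width):           # one cycle per bit
            abit = (a >> i) & 1
            bbit = (b >> i) & 1
            result |= (abit ^ bbit ^ carry) << i
            carry = (abit & bbit) | (carry & (abit ^ bbit))
        return result & ((1 << width) - 1)

    assert bit_serial_add(12345, 67890) == 12345 + 67890

The hardware cost is roughly one full adder and a flip-flop instead of a 32-bit carry chain, at the price of 32 cycles per add.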


SERV implements xor bit-serially too, though.

Maybe impressive in one way, but I'm also pretty sure a simple n-gram Markov model (a la Niall on the Amiga) would have a lower loss on the test set.

Transformers don't scale down very well, in my experience. I used to train local models all the time as new ones were released, and as I recall, transformers were the first architecture I couldn't get better results out of with my limited training data and GPU.
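
For reference, the kind of baseline I have in mind is just this (character-level; the add-one smoothing is a crude assumption, only there so the test-set loss is finite):

    # Character-level n-gram model scored by average negative
    # log-likelihood on held-out text.
    import math
    from collections import Counter, defaultdict

    def train(text, n=3):
        counts = defaultdict(Counter)
        for i in range(len(text) - n):
            counts[text[i:i+n]][text[i+n]] += 1
        return counts

    def avg_nll(counts, text, n=3, vocab=256):
        nll = 0.0
        steps = len(text) - n
        for i in range(steps):
            c = counts[text[i:i+n]]
            p = (c[text[i+n]] + 1) / (sum(c.values()) + vocab)  # add-one
            nll -= math.log(p)
        return nll / steps

On tiny training sets, a table of counts like this can beat a small transformer simply because it has nothing to under-train.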

