Yes, it turns out to be char-rnn/torch-rnn: 'RNN was 512-node, 3-layer', and he wasn't training on his own computer, so aside from being much too small (for modeling at least 3 languages, text, HTML, and CSS, I would start with 1k neurons and go up from there), it probably wasn't trained to convergence either.
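To put rough numbers on "much too small", here is a back-of-the-envelope parameter count for a character-level model at 512 vs. 1024 units per layer. This is only a sketch under assumptions not stated in the thread: LSTM cells with one bias vector per gate, one-hot character input, a single linear output layer, and a vocabulary of about 100 characters. The function name and `VOCAB` value are made up for illustration.

```python
def lstm_char_model_params(hidden: int, layers: int, vocab: int) -> int:
    """Approximate weight count for a stacked LSTM character model.

    Assumes one-hot input of size `vocab`, four gates per LSTM layer with
    one bias vector each, and a final linear layer mapping back to `vocab`.
    """
    total = 0
    input_size = vocab  # first layer sees the one-hot character
    for _ in range(layers):
        # Each of the 4 gates has weights over [input, hidden] plus a bias.
        total += 4 * (hidden * (input_size + hidden) + hidden)
        input_size = hidden  # deeper layers see the previous layer's output
    # Output projection: hidden -> vocab, plus bias.
    total += hidden * vocab + vocab
    return total


VOCAB = 100  # assumed character vocabulary size, not from the thread

small = lstm_char_model_params(512, 3, VOCAB)
large = lstm_char_model_params(1024, 3, VOCAB)
print(small, large, round(large / small, 1))
```

Under these assumptions the 512-unit model comes to about 5.5M parameters and the 1024-unit one to about 21.5M, so doubling the width roughly quadruples capacity.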

The 50MB size is maybe a bit on the small side but is probably not the root cause of the low text quality.

"I think you'll find that personal homepagers of yesteryear are bloggers now." GeoCities was packaged for inexperienced Internet users, and by 1998 it was the third most-visited site on the Web.

Jason Scott, who along with a group of around 15 volunteers called the Archive Team is working to archive GeoCities, says the selling point was ease of use: "Users were offered a worldwide audience, and the ability to say things any way they wanted to." Other online platforms began to spring up, and soon GeoCities became a fond memory for most users.

I suppose that NNs have some sort of limit to how many layers of nesting they can handle, but I thought they tended to learn nesting fairly well, unlike Markov chains?

It looks a lot like the stuff I produced a while back with an RNN. It does surprisingly well, often managing to close braces correctly.

It really goes to show how good the browsers are at dealing with junk HTML.

GeoCities was the spark that ignited my interest in programming.

I was thinking about trying to jointly train on the HTML and an image of the webpage, in the vague hope that you might be able to go from image to webpage.

I can remember hacking together awful HTML, CSS and JavaScript for a Sonic the Hedgehog fan page as early as middle school.

When Yahoo announced earlier this year that it was shuttering GeoCities, an online community of user-created Web pages from the early days of the Internet, the response was more mocking than mournful. "So Long GeoCities: We Forgot You Still Existed" read one PC World headline.
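On the earlier point about the RNN "often managing to close braces correctly": one way to check that claim on a generated sample is to count how deep the curly braces nest and whether they all get closed. This is a minimal sketch, not anything the original poster ran. The function name is hypothetical, and it ignores braces inside strings or comments.

```python
from typing import Tuple


def brace_balance(text: str) -> Tuple[bool, int]:
    """Return (balanced, max_depth) for the curly braces in `text`.

    `balanced` is False if a '}' appears with nothing open, or if any '{'
    is still open at the end. `max_depth` is the deepest nesting reached
    before the scan stopped.
    """
    depth = 0
    max_depth = 0
    for ch in text:
        if ch == "{":
            depth += 1
            max_depth = max(max_depth, depth)
        elif ch == "}":
            depth -= 1
            if depth < 0:
                # Closing brace with no matching opener.
                return False, max_depth
    return depth == 0, max_depth


sample = "body { color: red; } @media print { p { margin: 0; } }"
print(brace_balance(sample))
```

Running this over many generated pages and tallying the results would give a rough sense of how well nesting is learned at each depth, which is where a Markov chain with a short context would be expected to fall over.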

