Updated Tue, 28 Jun 2022 01:22:44 GMT

Programs to generate "random" data based on (broken) compression algorithms


Sometimes it's fun to look at the results of deliberately broken decompression: you change a compressed file a bit and then decompress it. The resulting file is broken from a certain position onward, and the data derails in stages: "slightly modified" -> "looks like normal data at first glance, but weird" -> "gibberish with recognizable parts of the source data" -> "pseudorandom" -> zeroes. Sometimes you get a funny piece of text (still shaped like the source data, but essentially random).
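To make the experiment concrete, here is a minimal sketch of "flip one bit, then decompress as far as possible". It uses raw deflate from Python's standard `zlib` module as a stand-in, not paq8l, so the behaviour differs: deflate may simply hit an invalid code and stop instead of drifting into noise. The helper names (`deflate_raw`, `flip_bit`, `tolerant_inflate`) are made up for this illustration.

```python
import zlib


def deflate_raw(data: bytes) -> bytes:
    """Compress to a raw deflate stream (no header, no checksum)."""
    comp = zlib.compressobj(9, zlib.DEFLATED, -15)
    return comp.compress(data) + comp.flush()


def flip_bit(blob: bytes, byte_index: int, bit: int = 0) -> bytes:
    """Return a copy of blob with one bit inverted."""
    out = bytearray(blob)
    out[byte_index] ^= 1 << bit
    return bytes(out)


def tolerant_inflate(blob: bytes) -> bytes:
    """Decompress one byte at a time and keep whatever came out
    before the decoder gave up (if it gives up at all)."""
    decomp = zlib.decompressobj(-15)
    out = bytearray()
    for i in range(len(blob)):
        try:
            out += decomp.decompress(blob[i:i + 1])
        except zlib.error:
            break  # invalid code or distance: stop, keep the partial output
    return bytes(out)


if __name__ == "__main__":
    source = b"Sometimes it's fun to look at broken decompression. " * 20
    packed = deflate_raw(source)
    damaged = flip_bit(packed, len(packed) // 2, 3)
    print(tolerant_inflate(damaged))
```

Feeding the stream byte by byte matters here: a single `decompress` call on the whole damaged stream would raise and discard the partial output.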

I usually use paq8l when I want to play with this (there is also a funny mode where you edit the compression level stored in the file), but the not-completely-broken part is small: the output quickly diverges to noise and then to zeroes.

  • Are there special programs that read source data and generate "similar" data (with an adjustable degree of similarity), using algorithms like the ones in compression programs?
  • Is the ability to generate interesting noise connected to the compression ratio (roughly, the "quality") of the algorithm?
  • Can I tell an existing decompressor "don't stop at the end of the compressed data; keep making something up from random input (with the state primed by the real data)"?
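To illustrate what the last bullet asks for: feeding random bits to an arithmetic decoder amounts, roughly, to sampling from the predictions of the compressor's model. The sketch below does that sampling directly with a small byte-level context model that backs off to shorter contexts, in the spirit of PPM. It is only a variable-order Markov model, so it is far simpler than what paq8l does, and all names (`train`, `generate`, and the `noise` knob standing in for the "degree of similarity") are assumptions made for this example.

```python
import random
from collections import Counter, defaultdict


def train(data: bytes, max_order: int = 4):
    """Count which byte follows every context of length 0..max_order."""
    counts = defaultdict(Counter)
    for i, sym in enumerate(data):
        for k in range(max_order + 1):
            if i - k < 0:
                break
            counts[data[i - k:i]][sym] += 1
    return counts


def generate(counts, seed: bytes, length: int, max_order: int = 4,
             noise: float = 0.0, rng=None) -> bytes:
    """Sample `length` bytes, starting from the state left by `seed`.

    noise is the probability of dropping to a shorter context at each
    back-off step: 0.0 stays close to the source, values near 1.0
    approach plain byte-frequency noise.
    """
    if not counts.get(b""):
        raise ValueError("model was trained on empty data")
    rng = rng or random.Random()
    out = bytearray(seed)
    for _ in range(length):
        order = min(max_order, len(out))
        # Back off while the context is unseen, or at random when noise > 0.
        while order > 0 and (bytes(out[len(out) - order:]) not in counts
                             or rng.random() < noise):
            order -= 1
        ctx = bytes(out[len(out) - order:]) if order else b""
        table = counts[ctx]
        out.append(rng.choices(list(table), list(table.values()))[0])
    return bytes(out[len(seed):])


if __name__ == "__main__":
    text = b"Sometimes it's fun to look at the results of broken decompression. " * 5
    model = train(text, max_order=4)
    for noise in (0.0, 0.3, 0.9):
        print(noise, generate(model, text[:4], 80, 4, noise, random.Random(1)))
```

Raising `max_order` or lowering `noise` makes the output follow the source more closely, which is one crude way to get a sliding scale of similarity.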

P.S. I already know about Markov chains; I'm looking for something more sophisticated.