this post was submitted on 18 Aug 2023
164 points (97.7% liked)
Asklemmy
Data compression. Something about "making less data out of ... the same data" is really mind-blowing, and the math is sick.
It is not that complicated. To make a simple example with strings: AAAABBBABABAB takes up 13 spaces, but written (compressed) as 4A3B3AB it takes up 7 spaces, shrinking it by almost half.
Now double it to AAAABBBABABABAAAABBBABABAB with 26 spaces and write it as 2(4A3B3AB) with 10 spaces, and it takes less than 40% of the space.
Compression algorithms just look for those repeated patterns.
Take those letters and imagine them as colored pixels, and you are compressing a picture.
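For anyone who wants to poke at this, here is a minimal sketch of the simplest form of the idea, run-length encoding, in Python. It is not the exact scheme from the example: plain run-length encoding only collapses runs of a single repeated character, so it will not find the repeated "AB" group. The function names are made up for illustration, and the sketch assumes the input contains no digits.

```python
from itertools import groupby


def rle_encode(text: str) -> str:
    """Replace each run of a repeated character with <count><char>.

    Single characters are left as they are, so "AAAAB" becomes "4AB".
    Assumes the input contains no digits.
    """
    pieces = []
    for char, run in groupby(text):
        length = len(list(run))
        pieces.append(f"{length}{char}" if length > 1 else char)
    return "".join(pieces)


def rle_decode(encoded: str) -> str:
    """Invert rle_encode: a number applies to the character that follows it."""
    pieces = []
    count = ""
    for char in encoded:
        if char.isdigit():
            count += char  # collect multi-digit counts
        else:
            pieces.append(char * int(count or "1"))
            count = ""
    return "".join(pieces)


original = "AAAABBBABABAB"
packed = rle_encode(original)
print(packed, len(original), "->", len(packed))
```

On the example string this only collapses the AAAA and BBB runs; catching the repeated "AB" pairs needs a scheme that looks for repeated groups, which is closer to what real compressors do.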
Once you get into audio, images and video it revolves a lot around converting temporal and/or positional data into the frequency domain rather than simple token replacement.
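A rough sketch of what "into the frequency domain" means, using a hand-written discrete cosine transform (the DCT-II, the transform family used by JPEG) in plain Python. The helper names `dct` and `idct` are my own, and real codecs add quantization and entropy coding on top, which this sketch leaves out.

```python
import math


def dct(signal):
    """Orthonormal DCT-II: turn samples into cosine-frequency coefficients."""
    n = len(signal)
    coeffs = []
    for k in range(n):
        scale = math.sqrt((1 if k == 0 else 2) / n)
        total = sum(x * math.cos(math.pi / n * (i + 0.5) * k)
                    for i, x in enumerate(signal))
        coeffs.append(scale * total)
    return coeffs


def idct(coeffs):
    """Inverse of dct (an orthonormal DCT-III): rebuild the samples."""
    n = len(coeffs)
    samples = []
    for i in range(n):
        total = sum(math.sqrt((1 if k == 0 else 2) / n) * c
                    * math.cos(math.pi / n * (i + 0.5) * k)
                    for k, c in enumerate(coeffs))
        samples.append(total)
    return samples


flat = [5.0] * 8                   # eight identical "pixels"
print([round(c, 3) for c in dct(flat)])
print([round(x, 3) for x in idct(dct(flat))])
```

For the flat block, all the information ends up in the first coefficient and the rest are zero (up to rounding), so eight samples can be described by one number. Smooth but non-flat data tends to behave similarly, with most coefficients small enough to store coarsely or drop.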
Wait, doesn't your first example go from 13 spaces of binary to a shorter string in base 12 (the ten digits plus the two values A and B)?
That would make the "compressed" result 110111010111011101110011, which is larger than the original message when both are in binary...
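The bit-counting argument can be checked with a few lines of Python. This is a sketch under one assumption: every symbol gets a fixed-width code, so a 2-symbol alphabet costs 1 bit per symbol and a 12-symbol alphabet costs 4. `fixed_width_bits` is a made-up helper, and the lengths are taken from the strings exactly as written in the example, so the total depends on that coding choice.

```python
import math


def fixed_width_bits(message: str, alphabet_size: int) -> int:
    """Bits needed when every symbol gets the same fixed-width code."""
    bits_per_symbol = math.ceil(math.log2(alphabet_size))
    return len(message) * bits_per_symbol


original = "AAAABBBABABAB"   # only A and B: alphabet of 2
encoded = "4A3B3AB"          # digits 0-9 plus A and B: alphabet of 12

print(fixed_width_bits(original, 2), "bits for the original")
print(fixed_width_bits(encoded, 12), "bits for the encoded form")
```

Under that assumption the "compressed" string costs more bits than the original, which is the point being made here.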
Don't overthink my example; it was just an illustration.
Fair enough. The general idea is correct, I just found that example rather jarring... It is generally more difficult to compress an already small amount of data anyway.
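The small-data point is easy to see with a real compressor. A quick sketch using Python's standard `zlib` module: the zlib format adds a header and a checksum to every output, so a 13-byte input has no room to win, while the same pattern repeated many times compresses very well. The repeat count of 1000 is an arbitrary choice for illustration.

```python
import zlib

small = b"AAAABBBABABAB"      # the 13-byte example string
large = small * 1000          # same pattern, 13,000 bytes

for label, data in (("small", small), ("large", large)):
    packed = zlib.compress(data)
    # Decompressing must give back exactly what went in.
    assert zlib.decompress(packed) == data
    print(f"{label}: {len(data)} bytes -> {len(packed)} bytes")
```

The small input comes out bigger than it went in, and the large one shrinks to a small fraction of its size.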