Hey, what happened to the archive size?!

Information and news about the website will be posted here.
mrpijey
Administrator
Posts: 9193
Joined: Tue Feb 12, 2008 5:28 pm

Hey, what happened to the archive size?!

Post by mrpijey »

You may have noticed after the last update that our archive size dropped from almost 24.9TB down to 21.68TB.

Well, this has to do with all our Windows 10 betas. When Microsoft first started pushing out these ESD files, I converted them and put both the ESD and the ISO on the FTP. The conversion was cumbersome at first, and having the ISO ready let members use the beta "out of the box" without any fuss. We also didn't know whether ESD distribution was a temporary solution or a permanent one. As it turns out, it has become Microsoft's regular way of distributing new builds, and the conversion is far easier now with the ESD decrypter tool and the variant gus33000 has made for us (the lack of an official name made me dub it the "GUSTools" :) ). Therefore I don't see any reason to keep the converted ISOs on the FTP anymore. It's easy for each member to grab the ESD and the converter tool and convert the ESD themselves. This way we save some precious space on the FTP, and the releases become smaller as a result. Also, the ISOs were not originals and therefore went against our long-term goal of having as many original, unmodified, non-custom releases as possible.

But deleting the ISOs has the "unfortunate" effect of decreasing our archive size. That's to be expected, and in the long run it doesn't matter. Our aim isn't to chase the highest archive size number, but to have the most correct releases for you members. Over time the archive will grow again and eventually pass its previous size.
Official guidelines: Contribution Guidelines
Channels: Discord :: Twitter :: YouTube
Misc: Archived UUP

voidp
Posts: 394
Joined: Fri Jul 01, 2011 3:04 am

Re: Hey, what happened to the archive size?!

Post by voidp »

I'm curious to know how much space has been saved by de-duplication.

mrpijey
Administrator
Posts: 9193
Joined: Tue Feb 12, 2008 5:28 pm

Re: Hey, what happened to the archive size?!

Post by mrpijey »

Once I finish the current dedup run I'll post the numbers here. They won't be 100% accurate since I have everything split across four individual modules, and I also keep some stuff on the BA drives that the members can't access. But they should still give a good indication of the dedup effectiveness.

Intellmac
Posts: 49
Joined: Sat Feb 27, 2010 3:31 pm
Location: in explorer.exe

Re: Hey, what happened to the archive size?!

Post by Intellmac »

Also, the archives on the server could use 7z (where possible) with maximum compression to reduce time and bandwidth instead of rar.

computebrute
Donator
Posts: 680
Joined: Tue Dec 03, 2013 12:00 am
Location: us

Re: Hey, what happened to the archive size?!

Post by computebrute »

Makes perfect sense. Anyone could just easily convert an ESD but I have had bad experiences with that procedure.

Battler
Donator
Posts: 2117
Joined: Sat Aug 19, 2006 8:13 am
Location: Slovenia, Central Europe.

Re: Hey, what happened to the archive size?!

Post by Battler »

Intellmac wrote:Also, the archives on the server could use 7z (where possible) with maximum compression to reduce time and bandwidth instead of rar.
Then deduplication wouldn't work at all. Sure, you could use uncompressed 7z, but that would bring absolutely no advantage over the uncompressed rar used now.
Main developer of the 86Box emulator.
Join the 86Box Discord server, a nice community for true enthusiasts and 86Box supports!

The anime channel is on the Ring of Lightning Discord server.

Check out our SoftHistory Forum for quality discussion about older software.

jimmsta
Donator
Posts: 823
Joined: Sat Sep 09, 2006 6:43 am

Re: Hey, what happened to the archive size?!

Post by jimmsta »

This past weekend, I decided to compress hundreds of thousands of files from World of Warcraft builds into a single RAR file, with de-duplication enabled. The job took about 16 hours, but in the end I went from about 200 GB of scattered, fragmented files to a single 26 GB RAR file.

Granted, the FTP isn't the same sort of situation, but that is a pretty good indication of how well RAR's de-duplication works.
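
As a rough illustration of what file-level de-duplication does: hash each file's contents and store every distinct content only once. This is only a sketch of the idea, not RAR's actual implementation; the function name `dedup_savings` and the in-memory "builds" are made up for the example.

```python
import hashlib

def dedup_savings(files):
    """Return (total_bytes, unique_bytes) for a mapping of name -> content bytes."""
    seen = {}  # content hash -> size of the content, stored once
    total = 0
    for name, data in files.items():
        total += len(data)
        digest = hashlib.sha256(data).hexdigest()
        seen.setdefault(digest, len(data))
    return total, sum(seen.values())

# Hypothetical builds that share most of their files.
files = {
    "build1/common.dat": b"A" * 1000,
    "build2/common.dat": b"A" * 1000,
    "build3/common.dat": b"A" * 1000,
    "build3/patch.dat": b"B" * 200,
}
total, unique = dedup_savings(files)
print(total, unique)
```

The more builds share identical files, the larger the gap between the two numbers becomes.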
16 years of BA experience; I refurbish old electronics, and archive diskettes with a KryoFlux. My posting history is 16 years of educated speculation and autism.

mrpijey
Administrator
Posts: 9193
Joined: Tue Feb 12, 2008 5:28 pm

Re: Hey, what happened to the archive size?!

Post by mrpijey »

All of our archives use 0% compression (store) and only add a recovery archive. This is why we don't use 7z: it's a lot more memory intensive and doesn't have all the features we require (such as in-archive comments, locking and recovery archive). So we chose RAR for this purpose.

And actually, if we compressed everything at the maximum setting, the whole archive would look smaller, yes, but it would take up a lot more space on our hard drives, since compressed files barely deduplicate. That's why we chose large archives and deduplication instead of small archives and no deduplication.
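
The effect can be sketched in a few lines of Python. Everything here is an assumption made for illustration: fixed 4096-byte blocks stand in for whatever chunking the real dedup engine uses, zlib stands in for RAR or 7z compression, and `fake_text` is a made-up helper that generates filler content. Two "releases" share all their data except the first block; the script counts how many blocks they have in common when stored as-is and when compressed.

```python
import hashlib
import random
import zlib

BLOCK = 4096  # fixed block size, an assumption for this sketch

def blocks(data):
    """Return the set of SHA-256 digests of each fixed-size block."""
    return {hashlib.sha256(data[i:i + BLOCK]).digest()
            for i in range(0, len(data), BLOCK)}

def fake_text(seed, size):
    """Deterministic, compressible filler standing in for real file content."""
    rng = random.Random(seed)
    words = [b"setup", b"kernel", b"driver", b"update", b"system", b"config"]
    out = bytearray()
    while len(out) < size:
        out += rng.choice(words) + b" "
    return bytes(out[:size])

payload = fake_text(1, 64 * BLOCK)         # content shared by both releases
release_a = fake_text(2, BLOCK) + payload  # the releases differ only in
release_b = fake_text(3, BLOCK) + payload  # their first block

shared_stored = len(blocks(release_a) & blocks(release_b))
shared_packed = len(blocks(zlib.compress(release_a, 9)) &
                    blocks(zlib.compress(release_b, 9)))
print("shared blocks, stored:    ", shared_stored)
print("shared blocks, compressed:", shared_packed)
```

Stored, the 64 payload blocks are byte-identical in both releases and can be kept once. Compressed, a small difference at the start changes the rest of the output stream, so there is little or nothing left for deduplication to match.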

danielm

Re: Hey, what happened to the archive size?!

Post by danielm »

I always thought that every time you compress a file and then uncompress it again, it loses integrity and "quality".

mrpijey
Administrator
Posts: 9193
Joined: Tue Feb 12, 2008 5:28 pm

Re: Hey, what happened to the archive size?!

Post by mrpijey »

If that happened to a data file, compression would be pointless, as the data would be destroyed and unusable. What you're thinking of is lossy compression. That isn't applied to data files, but to music files (MP3) and image files (JPEG), where it's usually fine. It's not acceptable for data that has to decompress into exactly the same state it was in before compression.
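
A quick way to convince yourself that lossless compression really is lossless is to run data through a compress/decompress cycle several times and compare checksums. The sketch below uses Python's zlib as a stand-in for any lossless archiver; the sample data is made up.

```python
import hashlib
import zlib

original = b"Any data file: an ISO, an ESD, a document... " * 1000

# Compress and decompress ten times in a row.
data = original
for _ in range(10):
    data = zlib.decompress(zlib.compress(data, 9))

# Lossless compression restores every byte, so the checksums match.
same = hashlib.sha256(data).hexdigest() == hashlib.sha256(original).hexdigest()
print(same)
```

No matter how many cycles you run, the result stays bit-for-bit identical to the input.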

Hackerpcs
Donator
Posts: 441
Joined: Mon Feb 25, 2008 3:57 am
Location: Greece

Re: Hey, what happened to the archive size?!

Post by Hackerpcs »

mrpijey wrote:Once I finish the current dedup run I'll post the numbers here. They won't be 100% accurate since I have everything split across four individual modules, and I also keep some stuff on the BA drives that the members can't access. But they should still give a good indication of the dedup effectiveness.
Curious to see those numbers.

mrpijey
Administrator
Posts: 9193
Joined: Tue Feb 12, 2008 5:28 pm

Re: Hey, what happened to the archive size?!

Post by mrpijey »

[Image: deduplication results]

Andy
Administrator
Posts: 12815
Joined: Fri Aug 18, 2006 11:47 am
Location: United Kingdom

Re: Hey, what happened to the archive size?!

Post by Andy »

9TB savings. Not bad :)

voidp
Posts: 394
Joined: Fri Jul 01, 2011 3:04 am

Re: Hey, what happened to the archive size?!

Post by voidp »

10.21TB is saved if you count the fractional parts.

The overall deduplication rate seems to be 46.8% (10.21 ÷ 21.796).
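
For anyone who wants to check the arithmetic, here it is spelled out. Both figures are taken as given from the post above; the variable names are only for illustration.

```python
saved_tb = 10.21   # space saved by deduplication, in TB
total_tb = 21.796  # the total the savings are divided by, in TB

rate = saved_tb / total_tb
print(f"{rate:.1%}")  # prints 46.8%
```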

pool7
Posts: 234
Joined: Tue Jul 08, 2014 12:54 pm

Re: Hey, what happened to the archive size?!

Post by pool7 »

Those are some awesome savings!!!

I was wondering though: Does deduplication help save space for the artwork/documentation in the FTP?
As I understand it, it saves space when it finds the same data in different locations, which is highly likely across different builds/editions of an OS, for example. But I believe (correct me if I'm wrong) space and bandwidth could be saved further if the documentation and artwork were kept compressed.

mrpijey
Administrator
Posts: 9193
Joined: Tue Feb 12, 2008 5:28 pm

Re: Hey, what happened to the archive size?!

Post by mrpijey »

The documentation and artwork consist of PNGs and TIFFs, which are already compressed, so they don't compress well anyway. It doesn't make any difference whether I compress them or not.
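
The same thing can be seen with any already-compressed data. PNG uses DEFLATE internally, which is also what Python's zlib implements, so in this sketch a zlib-compressed buffer stands in for an already-compressed image. The word list and sizes are made up for the example.

```python
import random
import zlib

rng = random.Random(0)
words = ["manual", "cover", "scan", "label", "disk", "box", "page"]
raw = " ".join(rng.choice(words) for _ in range(50000)).encode()

once = zlib.compress(raw, 9)    # stands in for an already-compressed file
twice = zlib.compress(once, 9)  # compressing it a second time

print(len(raw), len(once), len(twice))
```

The first pass shrinks the data a lot. The second pass has almost nothing left to remove, so the output ends up about the same size as its input.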

TeoIzAwezome
Posts: 15
Joined: Sun Aug 23, 2015 2:04 am

Re: Hey, what happened to the archive size?!

Post by TeoIzAwezome »

This ESD thing is good and all, but what about Mac users?

mrpijey
Administrator
Posts: 9193
Joined: Tue Feb 12, 2008 5:28 pm

Re: Hey, what happened to the archive size?!

Post by mrpijey »

That's unfortunately a trade-off we have to live with. You simply need to use a VM or Boot Camp to handle the ESDs with the tools we have available here.

MSUser2013
Donator
Posts: 749
Joined: Sat Jan 12, 2013 9:08 am
Location: Washington State

Re: Hey, what happened to the archive size?!

Post by MSUser2013 »

mrpijey wrote:That's unfortunately a trade-off we have to live with. You simply need to use a VM or Boot Camp to handle the ESDs with the tools we have available here.
You can also use Wine to run the tools on Mac OS or Linux. That would be the easiest way to do it, though I don't know if it works; I haven't tested that method.

mrpijey
Administrator
Posts: 9193
Joined: Tue Feb 12, 2008 5:28 pm

Re: Hey, what happened to the archive size?!

Post by mrpijey »

Well, it won't work, as the tools depend on a lot of PowerShell components and so on; they are not standalone. So if you can get PowerShell to work in Wine then yes, but I doubt it...
