Hey, what happened to the archive size?!
You may have noticed after the last update that our archive size dropped from almost 24.9TB down to 21.68TB.
Well, this has to do with all our Windows 10 betas. In the beginning, when Microsoft started to push out these ESD files, I converted them and added both the ESD and the ISO to the FTP. The conversion was cumbersome at first, and a ready-made ISO helped members use the beta "out of the box" without any fuss. We also didn't know whether this ESD distribution method was a temporary solution or a more permanent one. As it turns out, it has become a permanent way for Microsoft to distribute new builds, and the conversion has become far easier with the ESD decrypter tool and the variant gus33000 has made for us (the lack of an official name for it made me dub it "GUSTools"). Therefore I don't see any reason to keep any of the converted ISOs on the FTP anymore. It's easy for each member to grab the ESD and the converter tool and convert the ESD themselves. This way we save some precious space on the FTP, and the releases become smaller as a result. Also, the ISOs were not originals and therefore went against our long-term goal of having as many original, unmodified/non-custom releases as possible.
But this deletion of the ISOs has the "unfortunate" effect of decreasing our archive size. This is of course understandable, but in the long run it doesn't matter. Our aim isn't to chase the highest archive size number, but to have the most correct releases for you members. Over time the archive will grow again and eventually exceed its previous size.
Official guidelines: Contribution Guidelines
Channels: Discord :: Twitter :: YouTube
Misc: Archived UUP
Re: Hey, what happened to the archive size?!
I'm curious to know how much space has been saved by de-duplication.
Re: Hey, what happened to the archive size?!
Once I finish the current dedup run I'll post the numbers here. They won't be 100% accurate, since I have everything split across four individual modules, and I also keep some stuff on the BA drives that the members can't access. But they should still give a good indication of the dedup effectiveness.
Re: Hey, what happened to the archive size?!
Also, the archives on the server could use 7z (where possible) with maximum compression to reduce time and bandwidth instead of rar.
- computebrute
- Donator
- Posts: 680
- Joined: Tue Dec 03, 2013 12:00 am
- Location: us
Re: Hey, what happened to the archive size?!
Makes perfect sense. Anyone could just easily convert an ESD but I have had bad experiences with that procedure.
- Battler
- Donator
- Posts: 2117
- Joined: Sat Aug 19, 2006 8:13 am
- Location: Slovenia, Central Europe.
- Contact:
Re: Hey, what happened to the archive size?!
Intellmac wrote: Also, the archives on the server could use 7z (where possible) with maximum compression to reduce time and bandwidth instead of rar.
Then deduplication wouldn't work at all. Sure, you could use uncompressed 7z, but that would bring absolutely no advantage over the uncompressed rar used now.
Main developer of the 86Box emulator.
Join the 86Box Discord server, a nice community for true enthusiasts and 86Box supports!
The anime channel is on the Ring of Lightning Discord server.
Check out our SoftHistory Forum for quality discussion about older software.
Re: Hey, what happened to the archive size?!
This past weekend, I decided to compress hundreds of thousands of files from World of Warcraft builds into a single RAR file, with de-duplication enabled. The job took about 16 hours, but in the end I went from about 200 GB of scattered, fragmented files to a single 26 GB RAR file.
Granted, the FTP isn't the same sort of situation, but that is a pretty good indication of how well RAR can de-duplicate.
16 years of BA experience; I refurbish old electronics, and archive diskettes with a KryoFlux. My posting history is 16 years of educated speculation and autism.
Re: Hey, what happened to the archive size?!
All of our archives use 0% compression (store), with only a recovery archive added. This is why we don't use 7z: it's a lot more memory intensive and doesn't have all the features we require (such as in-archive comments, locking and recovery archive). So we chose RAR for this purpose.
And actually, if we compressed everything at the maximum setting, the whole archive would be smaller, yes, but it would take up a lot more space on our hard drives, because compressed archives no longer deduplicate. That's why we chose large archives with deduplication instead of small archives without it.
-
danielm
Re: Hey, what happened to the archive size?!
I always thought that every time you compress a file and then uncompress it again, it loses integrity and "quality".
Re: Hey, what happened to the archive size?!
If that happened to a data file, compression would be pointless, as the data would be destroyed and unusable. What you're thinking of is lossy compression, which is not applied to data files but to music files (MP3) and image files (JPEG). There, lossy compression is usually fine, but not for data that needs to be uncompressed into exactly the same state as it was in before compression.
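To illustrate the lossless case, here is a minimal sketch using Python's standard `zlib` module. The sample bytes are made up for illustration and stand in for any data file; this is not how the FTP archives are produced (those are stored uncompressed).

```python
import zlib

# Made-up sample data standing in for any data file.
original = b"Windows 10 build data " * 1000

compressed = zlib.compress(original, 9)   # 9 = maximum compression level
restored = zlib.decompress(compressed)

# Lossless: the restored bytes are identical to the input, byte for byte.
is_identical = restored == original
is_smaller = len(compressed) < len(original)
print(is_identical, is_smaller)
```

However many times you repeat the compress/decompress cycle, the restored bytes stay the same as the input.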
Re: Hey, what happened to the archive size?!
mrpijey wrote: Once I finish the current dedup run I'll post the numbers here. It won't be 100% accurate since I have everything split on four individual modules, and I also keep some stuff on the BA drives that the members can't access. But it should still give a good indication of the dedup effectiveness.
Curious to see those numbers.
Re: Hey, what happened to the archive size?!
Re: Hey, what happened to the archive size?!
10.21TB is saved if you count the fractional parts.
The overall deduplication rate seems to be 46.8% (10.21 ÷ 21.796).
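For anyone who wants to redo the arithmetic, a small sketch with the two figures from this post (the variable names are only for illustration):

```python
saved_tb = 10.21    # space saved by deduplication, as quoted above
total_tb = 21.796   # the size the saving is divided by, as quoted above

rate = saved_tb / total_tb
rate_text = f"{rate:.1%}"
print(rate_text)
```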
Re: Hey, what happened to the archive size?!
Those are some awesome savings!!!
I was wondering though: Does deduplication help save space for the artwork/documentation in the FTP?
As I understand it, it saves space when finding the same data in different locations, which is highly likely for different builds/editions of an OS for example, but I believe (correct me if I'm wrong) space and bandwidth could be further saved if documentation and artwork are kept compressed.
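The idea of "the same data in different locations" can be sketched roughly like this. This is a simplified fixed-size block model, assumed purely for illustration; the chunk size, the `fake_data` helper and the two "builds" are all hypothetical, and the deduplication actually running on the server may work differently (for example with variable-size chunks).

```python
import hashlib

CHUNK_SIZE = 4096  # assumed block size, chosen only for illustration


def fake_data(label, size):
    """Deterministic pseudo-random bytes, a stand-in for real file content."""
    out = bytearray()
    counter = 0
    while len(out) < size:
        out += hashlib.sha256(f"{label}:{counter}".encode()).digest()
        counter += 1
    return bytes(out[:size])


def dedup_stats(files):
    """Return (logical_bytes, stored_bytes) when identical chunks are kept once."""
    unique_chunks = {}
    logical = 0
    for data in files:
        for offset in range(0, len(data), CHUNK_SIZE):
            chunk = data[offset:offset + CHUNK_SIZE]
            logical += len(chunk)
            # Chunks with the same hash are stored only once.
            unique_chunks.setdefault(hashlib.sha256(chunk).digest(), len(chunk))
    return logical, sum(unique_chunks.values())


# Two made-up "builds" that share ten chunks and differ in two chunks each.
shared = fake_data("shared", 10 * CHUNK_SIZE)
build_a = shared + fake_data("a", 2 * CHUNK_SIZE)
build_b = shared + fake_data("b", 2 * CHUNK_SIZE)

logical, stored = dedup_stats([build_a, build_b])
print(logical, stored)
```

In this toy case the shared ten chunks are stored once, so 24 logical chunks take the space of 14.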
Re: Hey, what happened to the archive size?!
Considering the documentation and artwork consist of PNGs and TIFs, they don't compress well anyway as they are already compressed, so it doesn't make any difference whether I compress them or not.
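A rough way to see this effect, sketched with Python's `zlib`. Random bytes are used here as a stand-in for already-compressed image data (an assumption for illustration, since both look like high-entropy data to a compressor); no real PNG or TIF files are involved.

```python
import os
import zlib

# Repetitive text compresses very well.
text_like = b"The quick brown fox jumps over the lazy dog. " * 2000

# Random bytes as a stand-in for already-compressed image data (assumption).
image_like = os.urandom(100_000)

text_ratio = len(zlib.compress(text_like, 9)) / len(text_like)
image_ratio = len(zlib.compress(image_like, 9)) / len(image_like)

# A ratio near 1.0 means compression gained essentially nothing.
print(f"text: {text_ratio:.3f}  image-like: {image_ratio:.3f}")
```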
-
TeoIzAwezome
- Posts: 15
- Joined: Sun Aug 23, 2015 2:04 am
Re: Hey, what happened to the archive size?!
This ESD thing is good and all, but what about Mac users?
Re: Hey, what happened to the archive size?!
That's unfortunately a trade-off we have to live with. You simply need to use a VM or Boot Camp to handle the ESDs with the tools we have available here.
- MSUser2013
- Donator
- Posts: 749
- Joined: Sat Jan 12, 2013 9:08 am
- Location: Washington State
Re: Hey, what happened to the archive size?!
mrpijey wrote: That's unfortunately a trade off we have to live with. You simply need to use a VM or Bootcamp to handle the ESDs with the tools we got available here.
You can also use Wine to run the tools on Mac OS or Linux. That's the easiest way to do it. I don't know if it works though, as I haven't tested that method.
Re: Hey, what happened to the archive size?!
Well, it won't work, as the tools depend on a lot of PowerShell components etc. The tools are not standalone, so if you can get PowerShell to work in Wine then yes, but I doubt it...