Introducing the Microsoft KB Archive

x010
Staff
Posts: 1311
Joined: Thu Jun 13, 2013 4:46 pm
Location: Leaderboard

Introducing the Microsoft KB Archive

Post by x010 »

BetaArchive now has a new section - the Microsoft KB Archive. The aim of this section is to provide a database of old KB articles that Microsoft has since taken down. It's available now on the wiki: Microsoft KB Archive.

There are currently more than 215,000 KB articles in the wiki, covering help articles for everything from DOS to Excel 2000 and beyond, with some being over 32 years old! Nearly all inter-article links and many images have been preserved, so you can jump seamlessly from one KB article to any article it links to, and images appear as expected. You can always use the wiki search interface to look for a specific KB article or content.

Remember that you can also improve the archive by adding new articles and improving the formatting of existing ones. Those interested in the technical aspects (that is, how this was done) should see this page.

Many thanks to user 3155ffGd for allowing us to import his large collection of more than 215,000 articles to the wiki, and to Andy for running the import scripts.

Exemptus
Posts: 52
Joined: Sun Jun 16, 2019 9:29 pm
Location: England

Re: Introducing the Microsoft KB Archive

Post by Exemptus »

This is extremely interesting. Big thanks to the team for the effort put into it.

warezit2000
Posts: 20
Joined: Sat Mar 23, 2019 11:05 pm

Re: Introducing the Microsoft KB Archive

Post by warezit2000 »

Amazing, thanks a lot for putting this together!

Darkstar
Donator
Posts: 1210
Joined: Fri May 14, 2010 1:29 pm
Location: Southern Germany

Re: Introducing the Microsoft KB Archive

Post by Darkstar »

Cool idea to preserve these KB articles centrally.

BTW is that in any way related to this? IIRC Jeff was the first one to extract these old KB articles from the Programmer's Library. For many articles on the wiki there is no "source" section with attributions to anything, so I'm wondering if they were all imported or if there's some stuff from Jeff that's still missing on the Wiki
I upload stuff to archive.org from time to time. See here for everything that doesn't fit BA

x010
Staff
Posts: 1311
Joined: Thu Jun 13, 2013 4:46 pm
Location: Leaderboard

Re: Introducing the Microsoft KB Archive

Post by x010 »

Darkstar wrote:
Sun Jul 19, 2020 12:04 pm
Cool idea to preserve these KB articles centrally.

BTW is that in any way related to this? IIRC Jeff was the first one to extract these old KB articles from the Programmer's Library. For many articles on the wiki there is no "source" section with attributions to anything, so I'm wondering if they were all imported or if there's some stuff from Jeff that's still missing on the Wiki
With respect to the KB archives on the wiki, 96% of them were imported from 3155ffGd's giant collection, which he took from various Microsoft sources, and the remaining 4% were imported from my collection. I didn't take anything from Jeff's website. The technical guide linked in the announcement explains how this was done in more detail.

Typically I am pretty strict about mentioning exactly where these articles were taken from (for example, MSDN), but I've waived this requirement for imported articles due to the sheer number of them. That being said, the original contributor (that is, 3155ffGd in most cases) is mentioned in the article history.

DOS
Posts: 205
Joined: Sun Mar 16, 2014 6:56 am

Re: Introducing the Microsoft KB Archive

Post by DOS »

x010 wrote:
Sat Jul 18, 2020 8:25 pm
Remember that you can also improve the archive by adding new ones and improving formatting of existing ones. Those interested in the technical aspects (that is, how this was done) should see this page.
The first step listed there is "Look for the .htm file that contains the KB files you want." Is that meant to be ".chm"? From what I've heard that format was only used from 1998 to around 2002, and at other times they've used .mvb, .ivt and .hxs. It doesn't look like 7-Zip claims to support any of those formats so it might be worth noting that the instructions aren't applicable to all MSDN discs.
Darkstar wrote:
Sun Jul 19, 2020 12:04 pm
BTW is that in any way related to this? IIRC Jeff was the first one to extract these old KB articles from the Programmer's Library. For many articles on the wiki there is no "source" section with attributions to anything, so I'm wondering if they were all imported or if there's some stuff from Jeff that's still missing on the Wiki
In terms of sourcing, I'd be interested to know where the articles came from originally - e.g. MSDN month and year - rather than just who extracted them from the original source, in case Microsoft changed the articles over time.

x010
Staff
Posts: 1311
Joined: Thu Jun 13, 2013 4:46 pm
Location: Leaderboard

Re: Introducing the Microsoft KB Archive

Post by x010 »

DOS wrote:
Mon Jul 20, 2020 11:21 am
x010 wrote:
Sat Jul 18, 2020 8:25 pm
Remember that you can also improve the archive by adding new ones and improving formatting of existing ones. Those interested in the technical aspects (that is, how this was done) should see this page.
The first step listed there is "Look for the .htm file that contains the KB files you want." Is that meant to be ".chm"? From what I've heard that format was only used from 1998 to around 2002, and at other times they've used .mvb, .ivt and .hxs. It doesn't look like 7-Zip claims to support any of those formats so it might be worth noting that the instructions aren't applicable to all MSDN discs.
Sorry, that was a typo, thanks for pointing that out. And as for the specialised formats, I haven't yet encountered such a case, but should that happen I guess one option is to convert to HTML or PDF by printing to a file.
DOS wrote:
Mon Jul 20, 2020 11:21 am
Darkstar wrote:
Sun Jul 19, 2020 12:04 pm
BTW is that in any way related to this? IIRC Jeff was the first one to extract these old KB articles from the Programmer's Library. For many articles on the wiki there is no "source" section with attributions to anything, so I'm wondering if they were all imported or if there's some stuff from Jeff that's still missing on the Wiki
In terms of sourcing, I'd be interested to know where the articles came from originally - e.g. MSDN month and year - rather than just who extracted them from the original source, in case Microsoft changed the articles over time.
Ideally, I would like this as well, but this would be quite cumbersome (which is why I partially waived the usual source requirements).

Edit: Thinking about it, a nice way to do this would have been to preserve the file timestamps (which the import script supports), but it would be tricky to carry the timestamps over from the pre-conversion files to the post-conversion ones, and I don't know how well MediaWiki would handle that.
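
For anyone who wants to try the timestamp idea, here is a minimal Python sketch of the file-system half of it: copying the modification time from an original file onto its converted counterpart. The function name `copy_mtime` and the idea of running it as a step between conversion and import are assumptions for illustration only; the thread does not say how the import script reads timestamps, and nothing here touches MediaWiki itself.

```python
import os
from pathlib import Path


def copy_mtime(original: Path, converted: Path) -> None:
    """Give the converted file the same access and modification times as the original.

    Hypothetical helper: intended to run after conversion and before import,
    so that a timestamp-aware importer would see the original file date.
    """
    stat = original.stat()
    # os.utime accepts nanosecond precision through the ns= keyword.
    os.utime(converted, ns=(stat.st_atime_ns, stat.st_mtime_ns))
```

It could be called once per converted article, for example `copy_mtime(Path("Q12345.htm"), Path("Q12345.txt"))`, where both file names are made up. Whether the wiki would then show sensible revision dates is a separate question that this sketch does not answer.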

xpclient
Posts: 475
Joined: Fri Aug 28, 2009 1:10 pm
Location: Windows

Re: Introducing the Microsoft KB Archive

Post by xpclient »

Thank you, this is awesome! The two other KB archive sources I know of are:

https://www.infania.net/misc/kbarchive/
and
https://jeffpar.github.io/kbarchive/

These may be useful if more articles need to be imported.
xpclient
Huge Microsoft fan and old software collector since Windows 3.0 and MS-DOS :mrgreen:
I did the testing and feedback for Classic Shell.

3155ffGd
Posts: 391
Joined: Wed May 02, 2012 12:57 am

Re: Introducing the Microsoft KB Archive

Post by 3155ffGd »

DOS wrote:
Mon Jul 20, 2020 11:21 am
From what I've heard that format was only used from 1998 to around 2002, and at other times they've used .mvb, .ivt and .hxs. It doesn't look like 7-Zip claims to support any of those formats so it might be worth noting that the instructions aren't applicable to all MSDN discs.
Note that we have decompressors for all of these. helpdeco supports .mvb and .ivt, and a decompressor for .hxs is part of the Visual Studio 2008 SDK (which is how I extracted the Knowledge Base files originally).
DOS wrote:
Mon Jul 20, 2020 11:21 am
In terms of sourcing, I'd be interested to know where the articles came from originally - e.g. MSDN month and year - rather than just who extracted them from the original source, in case Microsoft changed the articles over time.
Unfortunately I haven't noted this down, but the timestamps in the original .7z file give a rough hint:

* Anything dated 2018/10/06 is part of MSDN November 2008
* Anything dated 2018/05/27 is part of TechNet January 2005
* Any HTML files with other timestamps (e.g. 2018/04/10) are from before MSDN migrated to the Help 2 format, i.e. pre-2002. Usually the copyright notice gives a good hint.
* For text files it's a bit more difficult, but text files with the same formatting always come from the same source. For example, all Multiplan and Microsoft OS/2 KB articles come from TechNet January 1994, and all Microsoft Bob KB articles come from TechNet November 1999.
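
The rules of thumb above can be sketched as a small Python lookup. This is only an illustration: the function name `guess_source` and the wording of the fallback strings are made up here, and the two date entries are simply the ones listed in this post.

```python
from datetime import date

# File dates inside the .7z archive mapped to the disc they were extracted from,
# as listed in the post above. Anything else has to be guessed.
KNOWN_SOURCES = {
    date(2018, 10, 6): "MSDN November 2008",
    date(2018, 5, 27): "TechNet January 2005",
}


def guess_source(file_date: date, is_html: bool) -> str:
    """Return a rough guess at where an archived KB file came from."""
    source = KNOWN_SOURCES.get(file_date)
    if source is not None:
        return source
    if is_html:
        # HTML files with other dates predate the move to the Help 2 format.
        return "pre-2002 MSDN (check the copyright notice)"
    # Text files need to be matched by formatting against files of known origin.
    return "unknown (compare formatting with text files of known origin)"
```

For example, `guess_source(date(2018, 10, 6), is_html=True)` returns `"MSDN November 2008"`, while an HTML file dated 2018/04/10 falls through to the pre-2002 case.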

DOS
Posts: 205
Joined: Sun Mar 16, 2014 6:56 am

Re: Introducing the Microsoft KB Archive

Post by DOS »

3155ffGd wrote:
Mon Jul 20, 2020 4:12 pm
helpdeco supports .mvb and .ivt
Are you sure it supports .ivt? It doesn't claim to, and I heard there was a completely separate tool for it.

Little Owl
Posts: 11
Joined: Fri Feb 07, 2020 7:00 pm

Re: Introducing the Microsoft KB Archive

Post by Little Owl »

It is sad that Microsoft is taking down articles for products that many people are still using for a good reason. I still use Windows XP because of the old games, and I think many people still do use Windows XP. I wish that Microsoft would instead bundle them into a downloadable archive for legacy users.

I hope BA will do well in making up for Microsoft's inflexibility, so to speak.
