BetaArchive Wiki:URL mass upload

This guide explains how to mass-upload files using upload by URL. This is useful, for instance, if you want to mass-upload files from the BetaArchive image uploader archives.

Procedure

  1. Set up a bot password at Special:BotPasswords. Grant it the options to upload and re-upload files and to edit files. Make a note of the username and password it gives you.
  2. In a file named py_toupload.txt, add the URLs of the files you want to mass-upload, one per line.
  3. Copy the code in the Code section into a Python file. Edit the upload_text variable as appropriate: this text is added to every upload and is a good way to categorise files automatically. Replace BOT_USERNAME and BOT_PASSWORD with the bot account details from step 1.
  4. Run the file with Python 3 (the script needs the requests library). You should see the files uploading; confirm by checking Special:RecentChanges.
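
As a rough illustration of step 2, the sketch below shows how a list file with one URL per line maps to destination filenames: each non-blank line is trimmed, and the last path segment of the URL becomes the filename. The helper name parse_url_list and the example URLs are hypothetical and are used only for illustration; they are not part of the script in the Code section.

```python
def parse_url_list(text):
    """Return (url, filename) pairs for each non-blank line of `text`.

    Hypothetical helper for illustration only.
    """
    pairs = []
    for line in text.splitlines():
        url = line.strip()
        if not url:
            continue  # ignore blank lines
        # The last path segment of the URL is used as the filename
        pairs.append((url, url.split("/")[-1]))
    return pairs


# Made-up example contents of py_toupload.txt
example = "https://example.com/images/a.png\n\nhttps://example.com/images/b.jpg\n"
for url, filename in parse_url_list(example):
    print(filename, "<-", url)
```

Checking the derived filenames this way before a real run can help catch URLs that end in a slash or carry stray whitespace.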

Code

"""
    upload_file_from_url.py

    MediaWiki API Demos
    Demo of `Upload` module: Post request to upload a file from a URL

    MIT license
"""

import requests

S = requests.Session()
URL = "https://www.betaarchive.com/wiki/api.php"

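# Read the list of URLs to upload (one per line); the file is closed automatically
with open("py_toupload.txt", "r", encoding="utf-8") as file_object:
    f1 = file_object.readlines()

# Text appended to the comment of every upload, e.g. a category tag
upload_text = "[[Category:Build 2011]]"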

# Step 1: Retrieve a login token
PARAMS_1 = {
    "action": "query",
    "meta": "tokens",
    "type": "login",
    "format": "json"
}

R = S.get(url=URL, params=PARAMS_1)
DATA = R.json()

LOGIN_TOKEN = DATA["query"]["tokens"]["logintoken"]

# Step 2: Send a post request to login. Use of main account for login is not
# supported. Obtain credentials via Special:BotPasswords
# (https://www.mediawiki.org/wiki/Special:BotPasswords) for lgname & lgpassword
PARAMS_2 = {
    "action": "login",
    "lgname": "BOT_USERNAME",
    "lgpassword": "BOT_PASSWORD",
    "format": "json",
    "lgtoken": LOGIN_TOKEN
}

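R = S.post(URL, data=PARAMS_2)
DATA = R.json()

# Stop early if the login was rejected, otherwise every upload would fail
if DATA.get("login", {}).get("result") != "Success":
    raise SystemExit("Login failed: " + str(DATA))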

# Step 3: While logged in, retrieve a CSRF token
PARAMS_3 = {
    "action": "query",
    "meta": "tokens",
    "format": "json"
}

R = S.get(url=URL, params=PARAMS_3)
DATA = R.json()

CSRF_TOKEN = DATA["query"]["tokens"]["csrftoken"]

# Step 4: Post request to upload each file from its URL
for line in f1:
    # Remove the trailing newline and any surrounding whitespace; skip blank lines
    file_url = line.strip()
    if not file_url:
        continue

    # Use the last path segment of the URL as the destination filename
    filename = file_url.split("/")[-1]
    print(filename)

    PARAMS_4 = {
        "action": "upload",
        "filename": filename,
        "url": file_url,
        "format": "json",
        "token": CSRF_TOKEN,
        "comment": "Source = " + file_url + " " + upload_text,
        "ignorewarnings": 1
    }
    R = S.post(URL, data=PARAMS_4)
    DATA = R.json()
    print(DATA)
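
The script prints each raw API response. A minimal sketch of how those responses could be summarised follows, assuming the usual MediaWiki response shapes: an "upload" object with a "result" field when the request is processed, or an "error" object with "code" and "info" fields when it is rejected. The helper name summarise_upload and the sample dictionaries are hypothetical and used only for illustration.

```python
def summarise_upload(data):
    """Return a one-line summary of an upload API response (hypothetical helper)."""
    if "error" in data:
        err = data["error"]
        return "FAILED: {} ({})".format(err.get("code", "unknown"), err.get("info", ""))
    upload = data.get("upload", {})
    result = upload.get("result", "Unknown")
    if result == "Success":
        return "OK: " + upload.get("filename", "")
    # Any other result (for example a warning) is reported with its details
    return result.upper() + ": " + str(upload.get("warnings", {}))


# Made-up sample responses for illustration
print(summarise_upload({"upload": {"result": "Success", "filename": "A.png"}}))
print(summarise_upload({"error": {"code": "badtoken", "info": "Invalid CSRF token."}}))
```

In the upload loop, print(DATA) could be replaced by print(summarise_upload(DATA)) to make failed uploads easier to spot in a long run.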