User talk:SnorlaxMonster
Hi there!
Welcome to our wiki, and thank you for your contributions! There's a lot to do around here, so I hope you'll stay with us and make many more improvements.
- Recent changes is a great first stop, because you can see what other people are editing right this minute, and where you can help.
- Please sign in, if you haven't already, and create a user name! It's free, and it'll help you keep track of all your edits. See SmashWiki:Why create an account? for more info on creating an account.
- Questions? You can ask at the Help desk or on the "discussion" page associated with each article, or post a message on my talk page!
- Need help? The Community Portal has an outline of the site, and pages to help you learn how to edit.
I'm really happy to have you here, and look forward to working with you!
- SnorlaxMonster (talk) 23:04, 29 October 2010 (EDT)
This
...isn't really apparent vandalism per se. The person who edited this could have easily messed up their wording. Don't forget to assume good faith. SugarCookie 420 10:57, April 7, 2019 (EDT)
- When the only change is to make something clearly false, it seems pretty clearly like vandalism to me. It's one thing if it looks like a typo that happened while making a constructive edit, but that was clearly a case where someone was deliberately inserting false information. --SnorlaxMonster 11:00, April 7, 2019 (EDT)
- Still, I see it as an honest mistake and less of actual vandalism. He may have misread the sentence or believed it worked better, but it is definitely not obvious vandalism. SugarCookie 420 15:45, April 7, 2019 (EDT)
All-Out Attack
Hi there. You know the All-Out Attack page you've edited? I've previously thought about the same thing. And that's why I edited it in. Juju1995 (talk) 01:43, April 25, 2019 (EDT)
Spirit battles and Python
I saw in the edit history for your list of spirit battles that you used Python to scrape them from their pages. If you don't mind, which API or package did you use for that? I'd like to use Python to make sure the hundreds of various places that list battles are consistent (the edits will be manual, I just don't want to spend several days recollecting the wiki markup). --CanvasK (talk) 18:50, August 26, 2020 (EDT)
- It's mostly just urllib.request and regular expressions. The code just iterates through each Spirit series page (I provide it a list of them) and parses the relevant table from the edit page. I probably should be using the MediaWiki API, but given how infrequently I run the code that shouldn't make a huge difference.
- I'm happy to share my code if you're interested, although since it was primarily for my personal use it's not necessarily particularly well commented (or even easy to follow) in its current form. It also doesn't currently work, because I hadn't anticipated it would need to deal with rowspans (e.g. Ninjara, Cuphead), and it doesn't fail particularly gracefully. I can take a look at the code over the weekend though, and try to get it working again. --SnorlaxMonster 06:53, August 27, 2020 (EDT)
- I hadn't thought of using the edit page; I was thinking of passing the regular page and somehow getting the markup text that way. I may take a shot at it myself now that I've got a rough idea of what to do; I may ask about seeing your code later if I run into too many issues.
- How'd you get just the Spirit Battle section? I see that there is "section=#" in the URL but that isn't the same for each series. --CanvasK (talk) 07:11, August 27, 2020 (EDT)
- If you're writing it from scratch, I would strongly recommend you use action=raw instead of action=edit. I wasn't aware of the existence of the former until recently, but it means that you only get the wikitext without any surrounding HTML.
- To find the section, I just used regex to find the section header, then assumed that all the content I wanted was contained within a single table (between a {| and a |}). This is the regex I used to find the section header:
re.compile(r'==Spirits? [Bb]attles?==(.*?)\|\}', flags=re.DOTALL)
- --SnorlaxMonster 07:27, August 27, 2020 (EDT)
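For anyone following along, the action=raw approach and the regex quoted above can be sketched roughly as follows. This is only an illustration, not the code actually used: the INDEX_PHP address is a placeholder, and the function names raw_url, fetch_wikitext and extract_battle_table are made up for this example.

```python
import re
import urllib.parse
import urllib.request

# Hypothetical placeholder; substitute the wiki's real index.php endpoint.
INDEX_PHP = "https://example.org/index.php"

# The pattern quoted above: section header through the first table close.
SECTION_RE = re.compile(r'==Spirits? [Bb]attles?==(.*?)\|\}', flags=re.DOTALL)


def raw_url(title, index_php=INDEX_PHP):
    """Build the action=raw URL for a page title."""
    query = urllib.parse.urlencode({"title": title, "action": "raw"})
    return f"{index_php}?{query}"


def fetch_wikitext(title, index_php=INDEX_PHP):
    """Download a page's wikitext (performs a network request)."""
    with urllib.request.urlopen(raw_url(title, index_php)) as response:
        return response.read().decode("utf-8")


def extract_battle_table(wikitext):
    """Return the text between the section header and the first '|}', or None."""
    match = SECTION_RE.search(wikitext)
    return match.group(1) if match else None
```

Because the match is non-greedy, it stops at the first |} after the header, which is why the "single table" assumption matters: a nested table inside the section would cut the capture short.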
- Oh, action=raw seems super handy, I'll use that. I think that should be all I need to get started. Thanks for your help! (now to refresh myself on regex for the 50th time) --CanvasK (talk) 07:39, August 27, 2020 (EDT)
- Yeah, if all you're doing is scraping the tables and concatenating the rows together, that should be fairly straightforward.
- I had also set the code up to store every row in a dict, so I could feed it a list of Spirits (i.e. for a Spirit Board event) and it would output a table of just those Spirit Battles, in the supplied order. It also adds the series as an additional column (as you can see on my userpage), which requires it to parse the "Series Order" table from the bottom of the page too. Those two parts are probably the most complicated parts. --SnorlaxMonster 07:51, August 27, 2020 (EDT)
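The row-to-dict idea described above might look something like the sketch below. It assumes a simplified table layout in which every data row sits on one line with cells separated by ||, and it keys rows on the name cell; both are assumptions made for illustration, as are the function names rows_by_name and build_table. The extra series column is left out.

```python
def rows_by_name(table_text, name_column=1):
    """Map each row's name cell to the row's raw wikitext."""
    rows = {}
    # Rows are separated by "|-"; the first chunk is the table opening.
    for chunk in table_text.split("\n|-")[1:]:
        body = chunk.strip()
        if not body.startswith("|"):  # header rows start with "!"
            continue
        cells = [cell.strip() for cell in body[1:].split("||")]
        if len(cells) > name_column:
            rows[cells[name_column]] = body
    return rows


def build_table(rows, wanted, header='{|class="wikitable"'):
    """Emit a table containing only the wanted rows, in the supplied order."""
    lines = [header]
    for name in wanted:
        if name in rows:  # names without a stored row are skipped
            lines.append("|-")
            lines.append(rows[name])
    lines.append("|}")
    return "\n".join(lines)
```

Feeding build_table a list of names (for a Spirit Board event, say) returns only those rows, ordered as supplied rather than as they appear on the source page.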
- I got the code working about a month ago and felt like giving you an update. I went with the row-to-dict like you mentioned though I eventually saved it as a .tsv to be used in a spreadsheet. I parsed the "Series Order" table first and then parsed the "Spirit Battle" table, adding to the dict based on the first cell (number). For Mii DLC battles I checked if a cell contained "rowspan", if so then go ahead and parse the next row (which has the Mii DLC info), increment the loop index, and add the Mii DLC to the dict based on what cells in the primary row don't have "rowspan". I did the same for the fighter pages; didn't need to parse a "Series Order" table but did run the code three times to do "main", "minion", and "ally" (three function calls, not three page requests). Another reason I used .tsv in the output is so I can keep track of fighters and not worry about mixing up or deleting inspiration sections.
- Again, I'd like to say thank you for your help and for getting me on track. --CanvasK (talk) 21:16, February 20, 2021 (EST)
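The rowspan handling and .tsv output described above could be sketched along these lines. This is a guess at the general shape, not the code that was actually written: it assumes rows have already been split into lists of cell strings, that every rowspan covers exactly two rows, and that the continuation row holds only the cells for the non-spanned columns. The names expand_rowspans and to_tsv are hypothetical.

```python
import csv
import io
import re

# Matches a leading rowspan attribute such as 'rowspan=2|' or 'rowspan="2"|'.
ROWSPAN_RE = re.compile(r'^\s*rowspan\s*=\s*"?\d+"?\s*\|')


def strip_attr(cell):
    """Remove a leading rowspan attribute from a cell, if present."""
    return ROWSPAN_RE.sub("", cell).strip()


def expand_rowspans(rows):
    """Turn a primary row plus its continuation row into two full rows."""
    expanded = []
    i = 0
    while i < len(rows):
        primary = rows[i]
        spanned = ["rowspan" in cell for cell in primary]
        expanded.append([strip_attr(cell) for cell in primary])
        if any(spanned) and i + 1 < len(rows):
            extra = iter(rows[i + 1])
            # Shared cells come from the primary row; the rest from the next row.
            expanded.append([
                strip_attr(cell) if is_spanned else next(extra, "")
                for cell, is_spanned in zip(primary, spanned)
            ])
            i += 1  # the continuation row has been consumed
        i += 1
    return expanded


def to_tsv(rows):
    """Serialise rows as tab-separated text for use in a spreadsheet."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
    writer.writerows(rows)
    return buffer.getvalue()
```

Testing for the substring "rowspan" mirrors the check described above; it would misfire on a cell whose visible text happens to contain that word.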