Wikimedia Meta-Wiki

Talk:Data dumps

From Meta, a Wikimedia project coordination wiki
This is an archived version of this page, as edited by 87.68.112.255 (talk) at 08:20, 29 March 2009 (→Problem with split stub dumps ?). It may differ significantly from the current version.

Note that this page is not necessarily monitored by those who can resolve all such problems. Some such queries would be more usefully directed to the appropriate mailing lists, such as wikitech-l.

Download Ireland articles and all templates

Does anyone know how to download all English Wikipedia articles about Ireland, plus all templates? I just want something like the navbox and all the pages in the Ireland category.

enwiki-20080103-pages-meta-history.xml.bz2

Arrrgggghhhh!!! After 148 hours of downloading, I was 97% done with enwiki-20080103-pages-meta-history.xml.bz2 when someone 404'd it!!! Now we are back to having NO complete Wiki dumps available. Is this a secret policy, or what?

Frequent abort / fail

Dumps frequently fail and then it takes quite a long time until a new one is prepared.

Also, many dumps often fail one after another and a lot of red lines appear at http://download.wikimedia.org/backup-index.html . I don't know how the dumping works, but maybe there's one bug that causes them all to fail. If one dump fails, then maybe the problem that caused it to fail causes the subsequent ones to fail and they are not retried until the next cycle.

All these observations are very amateur, so feel free to correct me.

If it cannot be fixed right away, can it at least be explained here at the main page, Data dumps?

I don't know about other projects, but on the Hebrew Wikipedia we frequently use it for analyzing and improving interwiki links (see en:Wikipedia:WikiProject Interlanguage Links/Ideas from the Hebrew Wikipedia) and for other purposes.

Thanks in advance. --Amir E. Aharoni 15:37, 30 July 2008 (UTC)

Well, the dumps failed on 2008-08-01, and now it is 2008-08-26. I think it's embarrassing for Wikimedia. :( --Ragimiri 16:10, 26 August 2008 (UTC)

What's worse (from my point of view at least) is that the "small" dumps work fine, but happen (or don't happen) at the mercy of the largest dumps, which as noted very often fail right away, run for a long time and then fail, or now and again, run for a very long time and actually succeed. It's a real shame we can't have these run every month (say), on a particular date, separately from the large dumps. However, it's probably entirely in vain to comment and complain here: I don't think the server admins/devs monitor this page. Whether they'd pay any attention to requests on wikitech-l remains to be seen. Alai 18:53, 4 September 2008 (UTC)


What's worse still is that I offered years ago to take the dumps and run with them, as it were. Instead a whole load of dev time went into smartening them up, but the dumps cannot be a high priority for the developers. Rich Farmbrough 22:04, 4 October 2008 (UTC)

en dump has "ETA 2009-07-25"?

Would I be going way out on a limb, were I to speculate that this might be yet another failure mode for the full en dump that we're currently in? Alai 07:06, 6 November 2008 (UTC)

No, it's not a failure; it's just a bad estimate. The full history dump does take a really long time, but (assuming it's allowed to run to completion) it'll finish well before July. Already it's estimating completion in May, so that's something. --Sapphic 04:21, 1 December 2008 (UTC)
Oh, that's all right, then. </sarcasm>. The "long time" the full history dump takes is typically around six weeks, not the thick end of a year. It's very clear that something is very badly broken here. Alai 19:14, 7 January 2009 (UTC)

ImportDump.php killed

First, I am sorry for my English.

I am trying to import the dump of the Ukrainian Wikipedia. After 5 minutes of importing I receive the message "Killed". I have changed the php.ini file and set the following parameters:

upload_max_filesize = 20M

post_max_size = 20M

max_execution_time = 1000

max_input_time = 1000

But I still receive the same message "Killed" after 5 minutes of importing (it imports 8000 pages at most). The web host provider's support has no idea what is going wrong.

Please, help me! Thank you! --93.180.231.55 20:58, 26 December 2008 (UTC)

This means that someone or something explicitly killed the process, probably because it consumed too many resources. Many shared hosting providers, universities, etc., kill processes automatically after a couple of minutes. Please talk to your local system admin. -- 81.163.107.36 10:31, 27 December 2008 (UTC)
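If the host kills long-running processes, one possible workaround is to cut the dump into smaller files and import them one at a time, so that no single run lasts long. The following is only a minimal sketch of that idea, not part of MediaWiki: the function names `split_dump` and `count_pages` and the `SAMPLE` text are made up for illustration, and it assumes the dump has `<page>` and `</page>` on lines of their own, which should be checked against the actual file. Whether shorter runs avoid the kill depends on the host's limits.

```python
import io
import xml.etree.ElementTree as ET

# Tiny made-up dump, used only to demonstrate the splitter.
SAMPLE = """<mediawiki xml:lang="uk">
  <siteinfo>
    <sitename>Test</sitename>
  </siteinfo>
  <page>
    <title>A</title>
  </page>
  <page>
    <title>B</title>
  </page>
  <page>
    <title>C</title>
  </page>
</mediawiki>
"""


def split_dump(lines, pages_per_chunk):
    """Yield lists of lines, each a standalone dump with at most
    pages_per_chunk pages.

    Assumes <page>, </page> and </mediawiki> each sit on their own line.
    Everything before the first <page> is treated as the header and is
    repeated at the top of every chunk.
    """
    header, body = [], []
    pages = 0
    in_header = True
    for line in lines:
        stripped = line.strip()
        if in_header:
            if stripped != "<page>":
                header.append(line)
                continue
            in_header = False
        if stripped == "</mediawiki>":
            break
        body.append(line)
        if stripped == "</page>":
            pages += 1
            if pages == pages_per_chunk:
                yield header + body + ["</mediawiki>\n"]
                body, pages = [], 0
    if body:
        # Remaining pages that did not fill a whole chunk.
        yield header + body + ["</mediawiki>\n"]


def count_pages(chunk_lines):
    """Parse a chunk as XML and count its <page> elements (any namespace)."""
    root = ET.fromstring("".join(chunk_lines))
    return sum(1 for el in root.iter() if el.tag.rsplit("}", 1)[-1] == "page")


if __name__ == "__main__":
    for n, chunk in enumerate(split_dump(io.StringIO(SAMPLE), 2), start=1):
        print(f"chunk {n}: {count_pages(chunk)} page(s)")
```

For a real dump, `lines` would be the opened dump file rather than `io.StringIO(SAMPLE)`, and each chunk would be written to its own file and imported separately. Only one chunk is held in memory at a time.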

Stub dumps

I just added a mention of the stub dumps, which I believe contains correct information. The stub dumps are useful for research purposes, and MUCH easier to work with size-wise, so I hope they will continue to be generated. 209.137.177.15 06:40, 3 March 2009 (UTC)

Problem with split stub dumps ?

I don't know if this is the right forum for this request, but the frwiki, dewiki and even enwiki dumps seem to repeatedly fail or take too long and get killed, apparently as a result of the long delay required for dumping « split stubs ». Would it be possible to reorder the dumps so that key dumps like pages-articles.xml.bz2 are produced before these split stub dumps? --66.131.214.76 21:49, 11 March 2009 (UTC) Laddo talk

Any updates? Answers? 87.68.112.255 08:20, 29 March 2009 (UTC)
