Research:Surviving new editor
- {\displaystyle n} = 1 edit
- {\displaystyle m} = 1 edit
- {\displaystyle t_{1}} = 1 day
- {\displaystyle t_{2}} = 30 days (~ one month)
- {\displaystyle t_{3}} = 30 days (~ one month)
SET @activation_period = 1;   /* One day */
SET @n = 1;                   /* One activation edit */
SET @trial_period = 30;       /* 30 days */
SET @survival_period = 30;    /* 30 days */
SET @m = 1;                   /* One survival edit */
SET @start_date = "20140101"; /* January 1st, 2014 after midnight */
SET @end_date = "20140201";   /* February 1st, 2014 before midnight */

SELECT
    user_id,
    user_name,
    user_registration,
    /* At least @n edits within the activation period */
    SUM(activation_edits) >= @n AS activated,
    /* Activated, and at least @m edits within the survival period */
    SUM(activation_edits) >= @n AND SUM(surviving_edits) >= @m AS surviving,
    /* The survival period has not finished yet for this user */
    (
        UNIX_TIMESTAMP(NOW()) <
        UNIX_TIMESTAMP(DATE_ADD(user_registration, INTERVAL @trial_period + @survival_period DAY))
    ) AS censored
FROM (
    /* Edits to pages that still exist */
    SELECT
        user_id,
        user_name,
        user_registration,
        SUM(
            rev_timestamp BETWEEN user_registration AND
            DATE_FORMAT(DATE_ADD(user_registration, INTERVAL @activation_period DAY), "%Y%m%d%H%i%s")
        ) AS activation_edits,
        SUM(
            rev_timestamp BETWEEN
            DATE_FORMAT(DATE_ADD(user_registration, INTERVAL @trial_period DAY), "%Y%m%d%H%i%s") AND
            DATE_FORMAT(DATE_ADD(user_registration, INTERVAL @trial_period + @survival_period DAY), "%Y%m%d%H%i%s")
        ) AS surviving_edits
    FROM user
    LEFT JOIN revision ON
        user_id = rev_user AND
        (
            rev_timestamp BETWEEN user_registration AND
            DATE_FORMAT(DATE_ADD(user_registration, INTERVAL @activation_period DAY), "%Y%m%d%H%i%s") OR
            rev_timestamp BETWEEN
            DATE_FORMAT(DATE_ADD(user_registration, INTERVAL @trial_period DAY), "%Y%m%d%H%i%s") AND
            DATE_FORMAT(DATE_ADD(user_registration, INTERVAL @trial_period + @survival_period DAY), "%Y%m%d%H%i%s")
        )
    WHERE user_registration BETWEEN @start_date AND @end_date
    GROUP BY user_id, user_name, user_registration
    UNION ALL
    /* Edits to pages that have since been deleted */
    SELECT
        user_id,
        user_name,
        user_registration,
        SUM(
            ar_timestamp BETWEEN user_registration AND
            DATE_FORMAT(DATE_ADD(user_registration, INTERVAL @activation_period DAY), "%Y%m%d%H%i%s")
        ) AS activation_edits,
        SUM(
            ar_timestamp BETWEEN
            DATE_FORMAT(DATE_ADD(user_registration, INTERVAL @trial_period DAY), "%Y%m%d%H%i%s") AND
            DATE_FORMAT(DATE_ADD(user_registration, INTERVAL @trial_period + @survival_period DAY), "%Y%m%d%H%i%s")
        ) AS surviving_edits
    FROM user
    LEFT JOIN archive ON
        user_id = ar_user AND
        (
            ar_timestamp BETWEEN user_registration AND
            DATE_FORMAT(DATE_ADD(user_registration, INTERVAL @activation_period DAY), "%Y%m%d%H%i%s") OR
            ar_timestamp BETWEEN
            DATE_FORMAT(DATE_ADD(user_registration, INTERVAL @trial_period DAY), "%Y%m%d%H%i%s") AND
            DATE_FORMAT(DATE_ADD(user_registration, INTERVAL @trial_period + @survival_period DAY), "%Y%m%d%H%i%s")
        )
    WHERE user_registration BETWEEN @start_date AND @end_date
    GROUP BY user_id, user_name, user_registration
) split_edit_counts
GROUP BY user_id, user_name, user_registration;
Surviving new editor is a standardized user class used to measure the number of first-time editors in a wiki project who continue to edit for a substantial period of time. It's used as a proxy for editor retention.
Discussion
The {\displaystyle t_{1}} activation period
The activation period selects users whose retention needs to be measured:
- setting {\displaystyle t_{1}=0} measures the retention (or rather the delayed activation) of newly registered users, regardless of when they started editing.
- setting {\displaystyle t_{1}>0} restricts the measurement of retention to the subset of users who edited within the given activation period after registration.
- setting {\displaystyle t_{1}=1} (one day) measures the retention of new editors, based on the proposed definition of a new editor. In that case, surviving new editors are effectively a proper subset of new editors.
The {\displaystyle t_{2}} trial period
During the trial period, new editors are presumed to be testing out Wikipedia and Wikipedians are testing out the editor. This is the time when non-retained editors tend to leave Wikipedia and when retained editors decide to stick around. The longer the duration of this period, the longer an editor will need to remain active in order to be counted.
The {\displaystyle t_{3}} survival period
During the survival period, new editors who are retained are expected to show some activity to indicate their survival. The longer the duration of the survival period, the more likely we are to notice some activity from editors who are less consistently active. Longer survival periods are also more likely to catch users who left Wikipedia and later reactivated their accounts.
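The three periods can be illustrated for a single user with a minimal Python sketch. It mirrors the window logic of the SQL query on this page, under some assumptions: the function name `classify_editor`, its argument names, and the use of in-memory `datetime` values instead of database timestamps are all hypothetical and chosen only for illustration.

```python
from datetime import datetime, timedelta


def classify_editor(registration, edit_times, now,
                    n=1, m=1, t1_days=1, t2_days=30, t3_days=30):
    """Return (activated, surviving, censored) for one registered user.

    registration -- datetime of account registration
    edit_times   -- iterable of datetimes, one per edit by this user
    now          -- datetime at which the measurement is taken
    """
    # Activation window: [registration, registration + t1]
    activation_end = registration + timedelta(days=t1_days)
    # Survival window: [registration + t2, registration + t2 + t3]
    survival_start = registration + timedelta(days=t2_days)
    survival_end = registration + timedelta(days=t2_days + t3_days)

    edit_times = list(edit_times)
    activation_edits = sum(registration <= t <= activation_end for t in edit_times)
    surviving_edits = sum(survival_start <= t <= survival_end for t in edit_times)

    activated = activation_edits >= n
    surviving = activated and surviving_edits >= m
    # Censored: the survival window has not fully elapsed yet, so a
    # "not surviving" result may still change.
    censored = now < survival_end
    return activated, surviving, censored


if __name__ == "__main__":
    reg = datetime(2014, 1, 1)
    edits = [datetime(2014, 1, 1, 12), datetime(2014, 2, 10)]
    print(classify_editor(reg, edits, now=datetime(2014, 6, 1)))
```

With the default parameters, an edit on the day of registration counts toward activation, and an edit 40 days after registration falls inside the survival window (days 30 to 60).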
Analysis
Wikis
German
English
Sensitivity
Trial period duration
The "Trial period factor" figure plots, as a factor, the number of users who edit after 1, 2, 4, 5 and 6 months relative to the number of users who edit after 3 months (the horizontal line at {\displaystyle 1}). Both enwiki and dewiki show a slight trend: the number of users surviving a trial period of 1 or 2 months is changing relative to the number surviving 3 or more months. The trend is not extreme and therefore might not matter, but it does suggest that even users who survive 1-2 months are becoming less likely to survive 3.
Survival period duration
The "Survival period factor" figure plots, as a factor, the number of users who edit within 1, 2, 4, 5 and 6 month windows relative to the number of users who edit within a 3 month window (the horizontal line at {\displaystyle 1}). For the survival period duration, we don't see any meaningful change over time.
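The factor relationship used in the sensitivity figures can be sketched as a simple ratio against the 3 month baseline. This is one reading of how the factors are computed, not a confirmed description of the original analysis; the function name `period_factors` is hypothetical and the counts in the usage example are made-up illustration values, not measured data.

```python
def period_factors(counts_by_months, baseline_months=3):
    """Express each surviving-editor count as a factor of the baseline count.

    counts_by_months -- dict mapping a period length in months to the number
                        of surviving new editors measured with that length
    baseline_months  -- period length whose count is mapped to a factor of 1
    """
    baseline = counts_by_months[baseline_months]
    if baseline == 0:
        raise ValueError("baseline count is zero; factors are undefined")
    return {months: count / baseline
            for months, count in counts_by_months.items()}


if __name__ == "__main__":
    # Made-up counts for a single registration cohort (illustration only).
    example_counts = {1: 150, 2: 120, 3: 100, 6: 80}
    print(period_factors(example_counts))
```

Computing these factors per registration cohort and plotting them over time would show whether the shorter or longer periods drift away from the 3 month baseline.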
Usage
- New editors' first session and retention: used to compare the survival of new editors over time and as the dependent variable in a logistic regression.
- Research:Teahouse long term new editor retention