tidyjson

tidyjson provides tools for turning complex JSON into tidy data.

Installation

Get the released version from CRAN:

 install.packages("tidyjson")

or the development version from GitHub:

devtools::install_github("colearendt/tidyjson")

Examples

The following example takes a character vector of 500 documents in the worldbank dataset and spreads out all objects.
Every JSON object key gets its own column with its type inferred, as long as the key does not hold an array. When recursive = TRUE (the default), spread_all does this recursively for nested objects and builds column names using the sep parameter (e.g. {"a":{"b":1}} with sep = '.' would generate a single column: a.b).

 library(dplyr)
 library(tidyjson)
 
worldbank %>% spread_all
 #> # A tbl_json: 500 x 9 tibble with a "JSON" attribute
 #> ..JSON docum...1 board...2 closi...3 count...4 proje...5 regio...6 total...7 _id.$...8
 #> <chr> <int> <chr> <chr> <chr> <chr> <chr> <dbl> <chr> 
 #> 1 "{\"_id\":{\... 1 2013-1... 2018-0... Ethiop... Ethiop... Africa 1.3 e8 52b213...
 #> 2 "{\"_id\":{\... 2 2013-1... <NA> Tunisia TN: DT... Middle... 0 52b213...
 #> 3 "{\"_id\":{\... 3 2013-1... <NA> Tuvalu Tuvalu... East A... 6.06e6 52b213...
 #> 4 "{\"_id\":{\... 4 2013-1... <NA> Yemen,... Gov't ... Middle... 0 52b213...
 #> 5 "{\"_id\":{\... 5 2013-1... 2019-0... Lesotho Second... Africa 1.31e7 52b213...
 #> 6 "{\"_id\":{\... 6 2013-1... <NA> Kenya Additi... Africa 1 e7 52b213...
 #> 7 "{\"_id\":{\... 7 2013-1... 2019-0... India Nation... South ... 5 e8 52b213...
 #> 8 "{\"_id\":{\... 8 2013-1... <NA> China China ... East A... 0 52b213...
 #> 9 "{\"_id\":{\... 9 2013-1... 2018-1... India Rajast... South ... 1.6 e8 52b213...
 #> 10 "{\"_id\":{\... 10 2013-1... 2014-1... Morocco MA Acc... Middle... 2 e8 52b213...
 #> # ... with 490 more rows, and abbreviated variable names 1​document.id,
 #> # 2​boardapprovaldate, 3​closingdate, 4​countryshortname, 5​project_name,
 #> # 6​regionname, 7​totalamt, 8​`_id.$oid`
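
The effect of recursive and sep is easier to see on a tiny input than on the 500 worldbank documents. The following is a minimal sketch, assuming a hypothetical one-document string (doc) built from the {"a":{"b":1}} case above; it relies on spread_all accepting a plain character vector.

```r
library(tidyjson)

# Hypothetical one-document input, modelled on the {"a":{"b":1}} case above
doc <- '{"a": {"b": 1}, "c": "x"}'

# Defaults: recursive = TRUE, sep = "."
nested <- doc %>% spread_all

# A different separator for nested column names
custom <- doc %>% spread_all(sep = "_")

# Top-level scalar keys only; the nested object "a" is not spread
flat <- doc %>% spread_all(recursive = FALSE)

names(nested)
names(custom)
names(flat)
```

With the defaults, the nested key should surface as a column named a.b; with sep = "_" it should be a_b; and with recursive = FALSE only the scalar key c is expected to be spread.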

Some keys in worldbank hold arrays, which spread_all does not handle. This example shows how to quickly summarize the top-level structure of a JSON collection:

worldbank %>% gather_object %>% json_types %>% count(name, type)
 #> # A tibble: 8 × 3
 #> name type n
 #> <chr> <fct> <int>
 #> 1 _id object 500
 #> 2 boardapprovaldate string 500
 #> 3 closingdate string 370
 #> 4 countryshortname string 500
 #> 5 majorsector_percent array 500
 #> 6 project_name string 500
 #> 7 regionname string 500
 #> 8 totalamt number 500
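
To see what each step of that pipeline contributes, here is a small sketch on a single hypothetical document (doc and its keys are invented for illustration). gather_object produces one row per top-level key in a name column, and json_types adds a type column describing the value under each key.

```r
library(dplyr)
library(tidyjson)

# Hypothetical document with one scalar, one array and one nested object
doc <- '{"id": 1, "tags": ["a", "b"], "meta": {"k": "v"}}'

types <- doc %>%
  gather_object %>%  # one row per top-level key, key stored in `name`
  json_types         # adds a `type` column (number, array, object, ...)

types %>% select(name, type)
```

Keys reported as array or object are the ones that need further navigation before their contents can be spread into columns.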

To capture the data in the majorsector_percent array, we can use enter_object to step into the value under that key, gather_array to stack the array elements into rows, and spread_all to spread the objects inside the array into columns.

worldbank %>%
 enter_object(majorsector_percent) %>%
 gather_array %>%
 spread_all %>%
 select(-document.id, -array.index)
 #> # A tbl_json: 1,405 x 3 tibble with a "JSON" attribute
 #> ..JSON Name Percent
 #> <chr> <chr> <dbl>
 #> 1 "{\"Name\":\"Educat..." Education 46
 #> 2 "{\"Name\":\"Educat..." Education 26
 #> 3 "{\"Name\":\"Public..." Public Administration, Law, and Justice 16
 #> 4 "{\"Name\":\"Educat..." Education 12
 #> 5 "{\"Name\":\"Public..." Public Administration, Law, and Justice 70
 #> 6 "{\"Name\":\"Public..." Public Administration, Law, and Justice 30
 #> 7 "{\"Name\":\"Transp..." Transportation 100
 #> 8 "{\"Name\":\"Health..." Health and other social services 100
 #> 9 "{\"Name\":\"Indust..." Industry and trade 50
 #> 10 "{\"Name\":\"Indust..." Industry and trade 40
 #> # ... with 1,395 more rows
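
The same pattern can keep a document-level field next to the stacked array rows. This is a sketch under stated assumptions: the document, its project and sectors keys, and the result name stacked are all hypothetical; spread_values and jstring are tidyjson functions used here to capture a top-level string before entering the array.

```r
library(dplyr)
library(tidyjson)

# Hypothetical document: one top-level string and one array of objects
doc <- '{"project": "demo", "sectors": [{"Name": "Education", "Percent": 60}, {"Name": "Health", "Percent": 40}]}'

stacked <- doc %>%
  spread_values(project = jstring(project)) %>%  # keep a top-level field
  enter_object(sectors) %>%                      # step into the array
  gather_array %>%                               # one row per array element
  spread_all                                     # Name and Percent columns

stacked %>% select(project, array.index, Name, Percent)
```

Each array element becomes its own row, with array.index recording its position and the project value repeated on every row.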

API

Spreading objects into columns

Object navigation

Array navigation

JSON inspection

JSON summarization

Creating tbl_json objects

Converting tbl_json objects

Included JSON data

Philosophy

The goal is to turn complex JSON data, which is often represented as nested lists, into tidy data frames that can be more easily manipulated.

Tidyjson depends upon

Further, there are other R packages that can be used to better understand JSON data
