I'm new to programming and MongoDB and am learning as I go. I'm attempting a map-reduce on a dataset using MongoDB. So far I've converted the CSV to JSON and imported it into MongoDB using Compass.
In Compass the data now looks like this:
_id :5bc4e11789f799178470be53
slug :"bitcoin"
symbol :"BTC"
name :"Bitcoin"
date :"2013-04-28"
ranknow :"1"
open :"135.3"
high :"135.98"
low :"132.1"
close :"134.21"
volume :"0"
market :"1500520000"
close_ratio :"0.5438"
spread :"3.88"
I've added each value as an index as follows. Is this the right process so I can run a mapreduce against the data?
db.testmyCrypto.getIndices()
[
{ "v" : 2, "key" : { "_id" : 1 }, "name" : "id", "ns" : "myCrypto.testmyCrypto" },
{ "v" : 2, "key" : { "slug" : 1 }, "name" : "slug_1", "ns" : "myCrypto.testmyCrypto" },
{ "v" : 2, "key" : { "symbol" : 2 }, "name" : "symbol_2", "ns" : "myCrypto.testmyCrypto" },
{ "v" : 2, "key" : { "name" : 3 }, "name" : "name_3", "ns" : "myCrypto.testmyCrypto" },
{ "v" : 2, "key" : { "data" : 4 }, "name" : "data_4", "ns" : "myCrypto.testmyCrypto" },
{ "v" : 2, "key" : { "ranknow" : 4 }, "name" : "ranknow_4", "ns" : "myCrypto.testmyCrypto" },
{ "v" : 2, "key" : { "ranknow" : 5 }, "name" : "ranknow_5", "ns" : "myCrypto.testmyCrypto" },
{ "v" : 2, "key" : { "open" : 6 }, "name" : "open_6", "ns" : "myCrypto.testmyCrypto" },
{ "v" : 2, "key" : { "high" : 7 }, "name" : "high_7", "ns" : "myCrypto.testmyCrypto" },
{ "v" : 2, "key" : { "low" : 8 }, "name" : "low_8", "ns" : "myCrypto.testmyCrypto" },
{ "v" : 2, "key" : { "volume" : 9 }, "name" : "volume_9", "ns" : "myCrypto.testmyCrypto" },
{ "v" : 2, "key" : { "market" : 10 }, "name" : "market_10", "ns" : "myCrypto.testmyCrypto" },
{ "v" : 2, "key" : { "close_ratio" : 11 }, "name" : "close_ratio_11", "ns" : "myCrypto.testmyCrypto" },
{ "v" : 2, "key" : { "spread" : 13 }, "name" : "spread_13", "ns" : "myCrypto.testmyCrypto" }
]
I've scrapped the above and now I'm doing the following from the link to the map-reduce. Is this the correct output?
> db.testmyCrypto.mapReduce(function() { emit( this.slug, this.symbol ); }, function(key, values) { return Array.sum( values ) },
... {
... query: { date:"2013-04-28" },
... out: "Date 04-28"
... }
... )
{
"result" : "Date 04-28",
"timeMillis" : 837,
"counts" : {
"input" : 0,
"emit" : 0,
"reduce" : 0,
"output" : 0
},
"ok" : 1
}
I've added the "key value pairs" but I don't seem to be able to get anything from the data.
> db.testmyCrypto.mapReduce(function() { emit( this.slug, this.symbol, this.name, this.date, this.ranknow, this.open, this.high, this.low, this.close, this.volume, this.market, this.close_ratio, this.spread ); }, function(key, values) { return Array.sum( values ) }, { query: { slug:"bitcoin" }, out: "Date 04-28" } )
{ "result" : "Date 04-28", "timeMillis" : 816,
"counts" : { "input" : 0, "emit" : 0, "reduce" : 0, "output" : 0 }, "ok" : 1 }
>
1 Answer
I've added each value as an index as follows. Is this the right process so I can run a mapreduce against the data?
For MapReduce, an index is only useful if it supports your query criteria. It looks like you have many more indices than would be useful, and you are missing an index for your example query on date.
I'm doing the following from the link to the map-reduce. Is this the correct output?
db.testmyCrypto.mapReduce(function() { emit( this.slug, this.symbol ); }, function(key, values) { return Array.sum( values ) },
The example here doesn't make sense, since you are trying to do an arithmetic sum of the symbol string value. Your original values also appear to be strings rather than numbers, which is not ideal.
When importing your data you should look into converting values to appropriate data types such as number and date. mongoimport has a --columnsHaveTypes option that lets you cast values to appropriate types when importing CSV or TSV files.
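To see why the value types matter, here is a minimal sketch in plain JavaScript (runnable in Node.js, no MongoDB involved). The runMapReduce helper and the docs array are hypothetical stand-ins, not MongoDB's implementation: only the field names and the first row's values come from the question, and emit is passed in as an argument here, whereas in MongoDB it is a global. The sketch shows that emit pairs one key with one value, and that adding string values concatenates them while adding converted numbers gives a sum.

```javascript
// Hypothetical in-memory stand-in for the collection; values are strings,
// as in the imported data shown in the question.
const docs = [
  { slug: "bitcoin", symbol: "BTC", open: "135.3" },
  { slug: "bitcoin", symbol: "BTC", open: "100.5" }, // made-up second row
];

// Simplified map-reduce driver (an assumption for illustration only):
// collects every emitted value under its key, then reduces each group.
function runMapReduce(documents, map, reduce) {
  const groups = new Map();
  const emit = (key, value) => {
    if (!groups.has(key)) groups.set(key, []);
    groups.get(key).push(value);
  };
  // Like MongoDB, call map with `this` bound to the current document.
  for (const doc of documents) map.call(doc, emit);

  const result = {};
  for (const [key, values] of groups) {
    // A key with a single value is passed through without calling reduce.
    result[key] = values.length === 1 ? values[0] : reduce(key, values);
  }
  return result;
}

// Plain `+` over the values: concatenates strings, sums numbers.
const add = (key, values) => values.reduce((total, v) => total + v);

// Emitting the raw string field: the "sum" is a concatenated string.
const stringTotals = runMapReduce(
  docs,
  function (emit) { emit(this.slug, this.open); },
  add
);

// Emitting a converted number: the reduce step produces a numeric total.
const numberTotals = runMapReduce(
  docs,
  function (emit) { emit(this.slug, Number(this.open)); },
  add
);

console.log(typeof stringTotals.bitcoin, stringTotals);
console.log(typeof numberTotals.bitcoin, numberTotals);
```

Converting once at import time, as suggested above, avoids having to repeat the Number(...) conversion in every map function.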
Recommended approach
I think what you are trying to do is calculate some values for a given cryptocurrency symbol. Where possible you should use the Aggregation Framework as it will be more efficient (and easier to troubleshoot) than MapReduce.
Since you mentioned you are using MongoDB Compass, you can use the Aggregation Pipeline Builder in Compass 1.14+ to interactively build an aggregation.
For example, using MongoDB 4.0 and downloading the same data set:
Add column type hints to the header line in the CSV using command-line tools (this could also be done in a text editor):
headerline='slug.string(),symbol.string(),name.string(),date.date(2006-01-02),ranknow.int32(),open.decimal(),high.decimal(),low.decimal(),close.decimal(),volume.decimal(),market.decimal(),close_ratio.decimal(),spread.decimal()'
sed "1s/.*/$headerline/" crypto-markets.csv > testmyCrypto.csv
Import the data using mongoimport:
mongoimport --headerline --columnsHaveTypes --type csv testmyCrypto.csv --db test --collection testmyCrypto --drop
Using the mongo shell (or Compass UI), add an index to support efficient querying by date:
db.testmyCrypto.createIndex({'date':1})
Run an aggregation query to add a calculated daily_change field with the difference between the open and close prices for each symbol on a given date:
db.testmyCrypto.aggregate([
    { $match: { date: new Date("2013-04-28") }},
    { $project: {
        _id: 0,
        symbol: "$symbol",
        name: "$name",
        daily_change: { $subtract: [ "$open", "$close" ] }
    }}
]).pretty()
Expected results:
{ "symbol" : "BTC", "name" : "Bitcoin", "daily_change" : NumberDecimal("1.09") }
{ "symbol" : "LTC", "name" : "Litecoin", "daily_change" : NumberDecimal("-0.05") }
{ "symbol" : "PPC", "name" : "Peercoin", "daily_change" : NumberDecimal("0.000433") }
{ "symbol" : "NMC", "name" : "Namecoin", "daily_change" : NumberDecimal("-0.01") }
{ "symbol" : "NVC", "name" : "Novacoin", "daily_change" : NumberDecimal("-0.03") }
{ "symbol" : "TRC", "name" : "Terracoin", "daily_change" : NumberDecimal("0.003901") }
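If you want to check the arithmetic without a database, the $match and $project stages above can be mimicked with filter and map in plain JavaScript. This is only a sketch under stated assumptions: the sample array is a hypothetical in-memory stand-in (the first row uses the Bitcoin values from the question, the second row is made up), and JavaScript doubles are used where the pipeline uses decimals, so the difference is rounded to six places to hide floating-point noise.

```javascript
// Hypothetical in-memory rows; only the Bitcoin values come from the question.
const sample = [
  { symbol: "BTC", name: "Bitcoin", date: new Date("2013-04-28"), open: 135.3, close: 134.21 },
  { symbol: "XXX", name: "Examplecoin", date: new Date("2013-04-29"), open: 1.0, close: 2.0 }, // made up
];

// Dates are compared by timestamp, because two Date objects are never === equal.
const target = new Date("2013-04-28").getTime();

const results = sample
  // Roughly what $match does: keep only rows for the requested date.
  .filter((doc) => doc.date.getTime() === target)
  // Roughly what $project does: pick fields and compute open minus close.
  .map((doc) => ({
    symbol: doc.symbol,
    name: doc.name,
    // Round to six places; doubles cannot represent these prices exactly.
    daily_change: Number((doc.open - doc.close).toFixed(6)),
  }));

console.log(results);
```

The rounding step is also a reason to import prices as decimal rather than double, as the typed header line above does: decimal values subtract without this kind of floating-point noise.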