NoSQL Databases do not organize data as tables linked by PK-FK relations. Instead, they use a different logical data model:
- Document Model
- Graph Model
- Key-value pair
Why NoSQL: the prevalence of small devices and distributed applications calls for flexible data structures, big-data capacity, and high availability. NoSQL databases are:
- schema-flexible: no need to change the data structure in order to add new data fields
- horizontally scalable: capacity grows by adding servers rather than by upgrading a single server
- optimized for a specific data model, with compromises (the BASE properties):
- basically available
- soft state
- eventual consistency
NoSQL Database Examples (MongoDB, DynamoDB, CouchDB, etc.)
CAP Theorem: A distributed system can guarantee at most two of the three properties at the same time: consistency, availability, and partition tolerance. This theorem classifies NoSQL databases into three categories (CA, CP, and AP).
ACID Properties (atomicity, consistency, isolation, durability) guarantee data validity despite errors and failures.
MongoDB Data Model:
- A database is a set of collections
- A collection is a set of documents
- A document corresponds to a record (row) and a collection to a table in a relational database
- JSON: {...} for a document (object), [...] for an array or list of values, X : Y for a key-value pair
- JSON: allows embedded objects inside an object
- JSON (JavaScript Object Notation): a text format for storing and exchanging object data, derived from JavaScript
- BSON (binary-encoded JSON) extends JSON with additional types, such as ObjectId (used for the _id field) and Date
- MongoDB Data Types: string, integer, double, Date, Array, ObjectId, binary, Null
- MongoDB for large files (GridFS), server failures (replication), and big data I/O (sharding)
- GridFS: how to load large files into MongoDB
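The JSON notation listed above can be tried out in plain JavaScript. The following is a minimal sketch, assuming a Node.js runtime; the student document and all of its field names and values are made up for illustration.

```javascript
// A hypothetical student document (field names and values are illustrative only).
const student = {
  sid: 1234567,                  // key : value pair holding a number
  sname: "John Doe",             // key : value pair holding a string
  courses: ["DB101", "NET202"],  // [...] holds an array of values
  address: {                     // {...} inside {...} is an embedded document
    city: "Springfield",
    zip: "12345"
  }
};

// JSON is a text format: serialize the object, then parse the text back.
const text = JSON.stringify(student);
const copy = JSON.parse(text);

console.log(text);
console.log(copy.address.city);    // read a field of the embedded document
console.log(copy.courses.length);  // arrays keep their order and length
```

Note that plain JSON has no Date or ObjectId type, which is the gap that BSON fills.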
MongoDB Installation: follow the link https://www.ecourse.org/news.asp?which=5020 for Windows installation and https://www.ecourse.org/news.asp?which=5343 for Linux.
MongoDB Shell Access:
- How to log on to MongoDB
- How to create/remove user accounts, change passwords, and grant roles
- How to look up databases, collections, and documents
- How to insert new documents
- How to remove documents, collections, and databases
- How to rename a database and a collection
Access to MongoDB Shell
The mongo shell is an interactive JavaScript interface to MongoDB. You can use the mongo shell to query and update data as well as perform administrative operations.
The mongo shell is included as part of the MongoDB Server installation. Once you have installed and have started MongoDB, connect the mongo shell to your running MongoDB instance.
Start the mongo Shell and Connect to MongoDB
Prerequisites
Ensure that MongoDB is running before attempting to start the mongo shell.
Open a terminal window (or a command prompt for Windows) and go to your <mongodb installation dir>/bin directory:
cd <mongodb installation dir>/bin
TIP: Adding your <mongodb installation dir>/bin to the PATH environment variable allows you to type mongo instead of having to go to the <mongodb installation dir>/bin directory or specify the full path to the binary.
Local MongoDB Instance on Default Port
You can run the mongo shell without any command-line options to connect to a MongoDB instance running on your localhost with the default port 27017:
mongo
Local MongoDB Instance on a Non-default Port
To explicitly specify the port, include the --port command-line option. For example, to connect to a MongoDB instance running on localhost with a non-default port 28015:
mongo --port 28015
MongoDB Instance on a Remote Host
To explicitly specify the hostname and/or port:
- You can specify a connection string. For example, to connect to a MongoDB instance running on a remote host machine:
mongo "mongodb://<hostname>:<port>"
- You can use the command-line option --host <hostname>:<port>. For example, to connect to a MongoDB instance running on a remote host machine:
mongo --host <hostname>:<port>
- You can use the --host and --port command-line options. For example, to connect to a MongoDB instance running on a remote host machine:
mongo --host <hostname> --port <port>
MongoDB Instance with Authentication
To connect to a MongoDB instance that requires authentication:
- You can specify the username, authentication database, and optionally the password in the connection string. For example, to connect and authenticate to a remote MongoDB instance as user alice:
mongo "mongodb://alice@<hostname>:<port>/?authSource=<authdb>"
- You can also use the --username, --password, and --authenticationDatabase command-line options. For example, to connect and authenticate to a remote MongoDB instance as user alice (if you specify --password without the user’s password, the shell will prompt for the password):
mongo --username alice --password --authenticationDatabase <authdb> --host <hostname> --port <port>
Connect to a MongoDB Replica Set
To connect to a replica set:
- You can specify the replica set name and members in the connection string:
mongo "mongodb://<host1>:<port1>,<host2>:<port2>,<host3>:<port3>/?replicaSet=<replica set name>"
- If using the DNS Seedlist Connection Format, you can specify the connection string:
mongo "mongodb+srv://<hostname>/"
Note that using the +srv connection string modifier automatically sets the ssl option to true for the connection.
- You can specify the replica set name and members with the --host <replica set name>/<host1>:<port1>,<host2>:<port2>,... command-line option. For example, to connect to a replica set named replA:
mongo --host replA/<host1>:<port1>,<host2>:<port2>,<host3>:<port3>
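The connection string variants described in this section all follow one pattern: the mongodb:// scheme, an optional user name, one or more host:port pairs, and optional query parameters. The following is a minimal sketch of that pattern in plain JavaScript, assuming Node.js. The helper buildMongoUri, its option names, and the example.com hosts are hypothetical and are not part of MongoDB; only the authSource and replicaSet parameter names come from the connection string format itself.

```javascript
// Build a standard-format MongoDB connection string.
// buildMongoUri and its option names are hypothetical helpers, not a MongoDB API.
function buildMongoUri({ hosts, user, authSource, replicaSet }) {
  // Percent-encode the user name so characters such as '@' or ':' stay unambiguous.
  const credentials = user ? encodeURIComponent(user) + "@" : "";
  const params = [];
  if (authSource) params.push("authSource=" + encodeURIComponent(authSource));
  if (replicaSet) params.push("replicaSet=" + encodeURIComponent(replicaSet));
  const query = params.length > 0 ? "/?" + params.join("&") : "";
  // Replica set members are listed as comma-separated host:port pairs.
  return "mongodb://" + credentials + hosts.join(",") + query;
}

// Single remote host (hypothetical host name).
console.log(buildMongoUri({ hosts: ["db.example.com:28015"] }));

// Authenticate as user alice against an assumed authentication database named admin.
console.log(buildMongoUri({
  hosts: ["db.example.com:28015"], user: "alice", authSource: "admin"
}));

// Replica set replA with two hypothetical members.
console.log(buildMongoUri({
  hosts: ["a.example.com:27017", "b.example.com:27017"], replicaSet: "replA"
}));
```

The resulting string is what would be passed, in quotes, as the argument to the mongo command.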
TLS/SSL Connection
For TLS/SSL connections, you can set the ssl=true option in the connection string or use the --ssl command-line option.
Working with the mongo Shell
To list all available databases, use the show dbs command. To display the database you are using, type db. The operation should return test, which is the default database. To switch databases, issue the use helper, as in the following example:
use <database>
You can switch to non-existent databases. When you first store data in the database, such as by creating a collection, MongoDB creates the database. For example, the following creates both the database myNewDatabase and the collection myCollection during the insertOne() operation:
use myNewDatabase
db.myCollection.insertOne( { x: 1 } );
The db.myCollection.insertOne() is one of the methods available in the mongo shell.
db refers to the current database.
myCollection is the name of the collection.
If the mongo shell does not accept the name of a collection, you can use the alternative db.getCollection() syntax. For instance, if a collection name contains a space or hyphen, starts with a number, or conflicts with a built-in function:
db.getCollection("3 test").find()
db.getCollection("3-test").find()
db.getCollection("stats").find()
The mongo shell prompt has a limit of 4095 codepoints for each line. If you enter a line with more than 4095 codepoints, the shell will truncate it.
Multi-line Operations in the mongo Shell
If you end a line with an open parenthesis ('('), an open brace ('{'), or an open bracket ('['), then the subsequent lines start with an ellipsis ("...") until you enter the corresponding closing parenthesis (')'), closing brace ('}'), or closing bracket (']'). The mongo shell waits for the closing parenthesis, brace, or bracket before evaluating the code, as in the following example:
> if ( x > 0 ) {
... count++;
... print (x);
... }
You can exit the line continuation mode if you enter two blank lines, as in the following example:
> if (x > 0
...
...
>
Tab Completion and Other Keyboard Shortcuts
The mongo shell supports keyboard shortcuts. For example:
- Use the up/down arrow keys to scroll through command history. See .dbshell documentation for more information on the .dbshell file.
- Use <Tab> to autocomplete or to list the completion possibilities, as in the following example, which uses <Tab> to complete the method name starting with the letter 'c':
db.myCollection.c<Tab>
Because there are many collection methods starting with the letter 'c', <Tab> will list the various methods that start with 'c'.
.mongorc.js File
When starting, mongo checks the user’s HOME directory for a JavaScript file named .mongorc.js. If found, mongo interprets the content of .mongorc.js before displaying the prompt for the first time. If you use the shell to evaluate a JavaScript file or expression, either by using the --eval option on the command line or by specifying a .js file to mongo, mongo will read the .mongorc.js file after the JavaScript has finished processing. You can prevent .mongorc.js from being loaded by using the --norc option.
Exit the Shell
To exit the shell, type quit() or use the <Ctrl-C> shortcut.
Basic MongoDB Shell Operations
Log in to the MongoDB shell
ubuntu@ip-10-0-1-223:~$ mongo 10.0.1.223
MongoDB shell version: 2.4.9
connecting to: 10.0.1.223/test
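Create your database by switching to it and inserting items (without the use command, the inserts would go to the default test database):
> use penguins
switched to db penguins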
> db.penguins.insert({"penguin": "skipper"})
> db.penguins.insert({"penguin": "kowalski"})
Check if it is there:
> show dbs
local 0.078125GB
penguins 0.203125GB
Let's make sure that database is the one we are on now:
> use penguins
switched to db penguins
Print json:
> printjson({"foo":"bar"})
{ "foo" : "bar" }
Get the rows back:
> db.penguins.find()
{ "_id" : ObjectId("5498da1bf83a61f58ef6c6d5"), "penguin" : "skipper" }
{ "_id" : ObjectId("5498da28f83a61f58ef6c6d6"), "penguin" : "kowalski" }
We only want to find one row
> db.penguins.findOne()
{ "_id" : ObjectId("5498da1bf83a61f58ef6c6d5"), "penguin" : "skipper" }
Get the _id of that row:
> db.penguins.findOne()._id
ObjectId("5498da1bf83a61f58ef6c6d5")
Example loop, print strings:
> db.penguins.find().forEach(function (doc){ print("hi") })
hi
hi
Example loop, same as find(), print the rows
> db.penguins.find().forEach(function (doc){ printjson(doc) })
{ "_id" : ObjectId("5498dbc9f83a61f58ef6c6d7"), "penguin" : "skipper" }
{ "_id" : ObjectId("5498dbd5f83a61f58ef6c6d8"), "penguin" : "kowalski" }
Drop the database when you are done:
> use penguins
switched to db penguins
> db.dropDatabase()
{ "dropped" : "penguins", "ok" : 1 }
Make sure it's gone:
> show dbs
local 0.078125GB
test (empty)
How to Format Date/Time Values
There are two ways to insert a date value into a MongoDB document: the ISODate() function and the JavaScript Date() constructor.
Here is an example document that uses the ISODate() function:
{
sid: 1234567,
sname: "John Doe",
date: ISODate('2001-12-31T12:01:15.123Z')
}
Here is an example document that uses the Date() constructor. Note that the month argument is zero-based (0 is January, 11 is December) and that the arguments are interpreted in the local time zone of the shell:
{
sid: 1234567,
sname: "John Doe",
date: new Date(2001, 11, 31, 12, 1)
}
Note that new Date() gives the current system date and time if no year, month, day, etc. are specified.
To insert a Date() value into MongoDB through the mongo shell, use the following syntax:
var yourVariableName = new Date(year, month, day, hour, minute);
db.yourCollectionName.insertOne({yourDateFieldName:yourVariableName});
Let us first create a date variable (month 3 is April, because months are zero-based):
> var creatingDate = new Date(2019, 3, 29, 13, 12);
Let us insert a document that uses it, which also creates the collection:
> db.insertingDateUsingVariableDemo.insertOne({"UserName":"John","UserMessages":["Hi","Hello","Awesome"],"UserPostDate":creatingDate});
Following is the query to display all documents from the collection with the help of the find() method:
> db.insertingDateUsingVariableDemo.find().pretty();
This will produce the following output (dates are stored in UTC; the shell here ran in a UTC+5:30 time zone, so 13:12 local time appears as 07:42Z):
{
"_id" : ObjectId("5c9d1b19a629b87623db1b21"),
"UserName" : "John",
"UserMessages" : [
"Hi",
"Hello",
"Awesome"
],
"UserPostDate" : ISODate("2019-04-29T07:42:00Z")
}
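The stored date above falls in April even though the month argument was 3, because the JavaScript Date constructor counts months from zero. The following is a minimal sketch of that behavior, assuming Node.js or any standard JavaScript engine; the variable names are illustrative.

```javascript
// Months are zero-based in the Date constructor: 0 = January, 11 = December.
const april = new Date(2019, 3, 29, 13, 12);
console.log(april.getMonth());      // 3, which means April
console.log(april.getMonth() + 1);  // 4, the calendar month number

// A month value of 12 rolls over into January of the following year.
const rolled = new Date(2001, 12, 31);
console.log(rolled.getFullYear(), rolled.getMonth()); // 2002 0

// Date.UTC takes the same zero-based month but interprets the fields as UTC,
// so the result does not depend on the machine's time zone.
const utc = new Date(Date.UTC(2001, 11, 31, 12, 1, 15, 123));
console.log(utc.toISOString()); // 2001-12-31T12:01:15.123Z
```

Using Date.UTC (or an ISODate string ending in Z) is one way to avoid the local time zone shift seen in the output above.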
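Document Timestamps: Although the BSON Timestamp type is 64 bits wide, only its first 32 bits hold the time (seconds since the Unix epoch); the remaining 32 bits are an ordinal counter. The ObjectId likewise embeds a 32-bit timestamp: the 12-byte ObjectId of each document consists of a 4-byte timestamp, a 5-byte random value, and a 3-byte incrementing counter. To get the timestamp of a document, call the getTimestamp() method on its ObjectId.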
Get an ISODate:
> ISODate("2013-03-01")
ISODate("2013-03-01T00:00:00Z")
Get the timestamp from the _id object:
> db.penguins.findOne()._id.getTimestamp()
ISODate("2014-12-23T02:57:31Z")
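What getTimestamp() does can be reproduced by hand from the hexadecimal form of the ObjectId. The following is a minimal sketch in plain JavaScript, assuming Node.js; objectIdToDate is a hypothetical helper written for this note, not a MongoDB function, and the ObjectId string is the one from the penguins example.

```javascript
// Decode the creation time embedded in an ObjectId's 24-character hex string.
// objectIdToDate is a hypothetical helper; in the shell, getTimestamp() does this for you.
function objectIdToDate(hex) {
  if (!/^[0-9a-fA-F]{24}$/.test(hex)) {
    throw new Error("expected a 24-character hexadecimal ObjectId string");
  }
  // The first 4 bytes (8 hex digits) are seconds since the Unix epoch.
  const seconds = parseInt(hex.substring(0, 8), 16);
  return new Date(seconds * 1000); // Date expects milliseconds
}

const created = objectIdToDate("5498da1bf83a61f58ef6c6d5");
console.log(created.toISOString()); // 2014-12-23T02:57:31.000Z
```

The result matches the shell output above, and the resolution is one second because only whole seconds are stored.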
Get the timestamp of the last added record:
> db.penguins.find().sort({_id:-1}).limit(1).forEach(function (doc){ print(doc._id.getTimestamp()) })
Tue Dec 23 2014 03:04:53 GMT+0000 (UTC)
Loop, get the system date:
> db.penguins.find().forEach(function (doc){ doc["timestamp_field"] = new Date(); printjson(doc); })
{
"_id" : ObjectId("5498dbc9f83a61f58ef6c6d7"),
"penguin" : "skipper",
"timestamp_field" : ISODate("2014-12-23T03:15:56.257Z")
}
{
"_id" : ObjectId("5498dbd5f83a61f58ef6c6d8"),
"penguin" : "kowalski",
"timestamp_field" : ISODate("2014-12-23T03:15:56.258Z")
}
Loop, get the date of each row:
> db.penguins.find().forEach(function (doc){ doc["timestamp_field"] = doc._id.getTimestamp(); printjson(doc); })
{
"_id" : ObjectId("5498dbc9f83a61f58ef6c6d7"),
"penguin" : "skipper",
"timestamp_field" : ISODate("2014-12-23T03:04:41Z")
}
{
"_id" : ObjectId("5498dbd5f83a61f58ef6c6d8"),
"penguin" : "kowalski",
"timestamp_field" : ISODate("2014-12-23T03:04:53Z")
}
Filter down to just the dates
> db.penguins.find().forEach(function (doc){ doc["timestamp_field"] = doc._id.getTimestamp(); printjson(doc["timestamp_field"]); })
ISODate("2014-12-23T03:04:41Z")
ISODate("2014-12-23T03:04:53Z")
Filter down further for just the strings:
> db.penguins.find().forEach(function (doc){ doc["timestamp_field"] = doc._id.getTimestamp(); print(doc["timestamp_field"]) })
Tue Dec 23 2014 03:04:41 GMT+0000 (UTC)
Tue Dec 23 2014 03:04:53 GMT+0000 (UTC)
Print a bare date, get its type, assign a date:
> print(new Date())
Tue Dec 23 2014 03:30:49 GMT+0000 (UTC)
> typeof new Date()
object
> new Date("11/21/2012");
ISODate("2012-11-21T00:00:00Z")
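Convert an instance of Date to yyyy-MM-dd without zero padding (d is assumed here to be January 1, 2014, which matches the output):
> var d = new Date(2014, 0, 1)
> print(d.getFullYear()+"-"+(d.getMonth()+1)+"-"+d.getDate())
2014-1-1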
Get it in yyyy-MM-dd format for each row:
> db.penguins.find().forEach(function (doc){ d = doc._id.getTimestamp(); print(d.getFullYear()+"-"+(d.getMonth()+1)+"-"+d.getDate()) })
2014-12-23
2014-12-23
The toLocaleDateString() method is briefer:
> db.penguins.find().forEach(function (doc){ d = doc._id.getTimestamp(); print(d.toLocaleDateString()) })
Tuesday, December 23, 2014
Tuesday, December 23, 2014
Get each row in yyyy-MM-dd HH:mm:ss format (the getters do not zero-pad, so 03:04:41 prints as 3:4:41):
> db.penguins.find().forEach(function (doc){ d = doc._id.getTimestamp(); print(d.getFullYear()+"-"+(d.getMonth()+1)+"-"+d.getDate() + " " + d.getHours() + ":" + d.getMinutes() + ":" + d.getSeconds()) })
2014-12-23 3:4:41
2014-12-23 3:4:53
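The output above is not strictly yyyy-MM-dd HH:mm:ss because single-digit values are not padded. The following is a minimal sketch of a padded formatter, assuming a JavaScript engine that supports String.prototype.padStart (for example a current Node.js; older mongo shells may lack it). The helper names pad2 and formatUtc are hypothetical, and the UTC getters are used so that the result does not depend on the local time zone.

```javascript
// Left-pad a number to two digits, e.g. 4 -> "04".
function pad2(n) {
  return String(n).padStart(2, "0");
}

// Format a Date as yyyy-MM-dd HH:mm:ss using its UTC fields.
function formatUtc(d) {
  return d.getUTCFullYear() + "-" + pad2(d.getUTCMonth() + 1) + "-" + pad2(d.getUTCDate()) +
    " " + pad2(d.getUTCHours()) + ":" + pad2(d.getUTCMinutes()) + ":" + pad2(d.getUTCSeconds());
}

// The first penguin timestamp from the example above.
console.log(formatUtc(new Date(Date.UTC(2014, 11, 23, 3, 4, 41)))); // 2014-12-23 03:04:41
```

In a forEach loop like the ones above, formatUtc(doc._id.getTimestamp()) could replace the hand-built string.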
Get the date of the last added row:
> db.penguins.find().sort({_id:-1}).limit(1).forEach(function (doc){ print(doc._id.getTimestamp()) })
Tue Dec 23 2014 03:04:53 GMT+0000 (UTC)
GridFS
GridFS is the MongoDB specification for storing and retrieving large files such as images, audio files, and video files. It works like a file system, but its data is stored within MongoDB collections. GridFS can store files that exceed the BSON document size limit of 16 MB.
GridFS divides a file into chunks and stores each chunk in a separate document, each with a default maximum size of 255 KB.
By default, GridFS uses two collections, fs.files and fs.chunks, to store the file's metadata and the chunks, respectively. Each chunk is identified by its unique _id ObjectId field. The fs.files document serves as the parent document. The files_id field in each fs.chunks document links the chunk to its parent.
Following is a sample document of fs.files collection −
{
"filename": "test.txt",
"chunkSize": NumberInt(261120),
"uploadDate": ISODate("2014-04-13T11:32:33.557Z"),
"md5": "7b762939321e146569b07f72c62cca4f",
"length": NumberInt(646)
}
The document specifies the file name, chunk size, upload date, MD5 checksum, and length.
Following is a sample document of the fs.chunks collection −
{
"files_id": ObjectId("534a75d19f54bfec8a2fe44b"),
"n": NumberInt(0),
"data": "Mongo Binary Data"
}
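The parent/chunk relationship shown in the two sample documents can be imitated in a few lines. The following is a rough sketch in plain JavaScript, assuming Node.js (for Buffer); splitIntoChunks and the string identifier "file-1" are hypothetical, and a real driver or mongofiles would use an ObjectId and write the documents to fs.files and fs.chunks for you.

```javascript
// Split a file's bytes into GridFS-style chunk documents.
// splitIntoChunks and the string fileId are hypothetical stand-ins for what a driver does.
function splitIntoChunks(fileId, bytes, chunkSize) {
  const chunks = [];
  for (let offset = 0, n = 0; offset < bytes.length; offset += chunkSize, n++) {
    chunks.push({
      files_id: fileId,                                // link back to the parent fs.files document
      n: n,                                            // sequence number of the chunk, starting at 0
      data: bytes.subarray(offset, offset + chunkSize) // at most chunkSize bytes
    });
  }
  return chunks;
}

// A 600000-byte dummy file with the default 255 KB (261120-byte) chunk size.
const fileBytes = Buffer.alloc(600000);
const chunkDocs = splitIntoChunks("file-1", fileBytes, 261120);
console.log(chunkDocs.length);                    // 3
console.log(chunkDocs.map(c => c.data.length));   // [ 261120, 261120, 77760 ]
```

Only the last chunk is shorter than the chunk size; reading the file back means fetching the chunks with the same files_id in order of n.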
Adding Files to GridFS
Now, we will store an mp3 file in GridFS using the put command. For this, we will use the mongofiles.exe utility present in the bin folder of the MongoDB installation folder.
Open your command prompt, navigate to the bin folder of the MongoDB installation folder, and type the following command −
>mongofiles.exe -d gridfs put song.mp3
Here, gridfs is the name of the database in which the file will be stored. If the database is not present, MongoDB will automatically create it on the fly. song.mp3 is the name of the file uploaded. To see the file's document in the database, you can use a find query −
>db.fs.files.find()
The above command returned the following document −
{
_id: ObjectId('534a811bf8b4aa4d33fdf94d'),
filename: "song.mp3",
chunkSize: 261120,
uploadDate: new Date(1397391643474),
md5: "e4f53379c909f7bed2e9d631e15c1c41",
length: 10401959
}
We can also see all the chunks present in fs.chunks collection related to the stored file with the following code, using the document id returned in the previous query −
>db.fs.chunks.find({files_id:ObjectId('534a811bf8b4aa4d33fdf94d')})
In my case, the query returned 40 documents, meaning that the whole mp3 file was divided into 40 chunks of data.
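The chunk count follows directly from the length and chunkSize fields of the fs.files document. A quick check in plain JavaScript, assuming Node.js; chunkCount is a hypothetical helper name.

```javascript
// Number of chunks needed for a file: round the quotient up,
// because the last chunk may be only partially filled.
function chunkCount(length, chunkSize) {
  return Math.ceil(length / chunkSize);
}

console.log(chunkCount(10401959, 261120)); // 40 (song.mp3 from the example)
console.log(chunkCount(646, 261120));      // 1  (test.txt from the earlier sample)
```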
Homework:
Reading: Model document relations in MongoDB (https://www.ecourse.org/news.asp?which=5342)
Correctness Questions: online at ecourse.org
Hands-on Questions (copy both the question number and the question before each answer):
1. Follow the reading materials and write scripts to create collections for the data in the following relational model:
2. Find the document with ename as JONES
3. For all the documents inserted by Question 1, find their timestamps
4. Find the last three employee documents inserted in Question 1
5. Find the ObjectId of all the documents in the employees collection created in Question 1
6. Find the ename, hiredate, and job fields of all the documents in the employees collection in Question 1
7. Find the top three salary earners in the employees collection
8. Find the first five employees hired by the company
9. Create a new user account with readWrite permission for your database
10. Find the last document you inserted into the employees collection