How to improve MongoDB aggregation performance

+1 vote
484 views

I have a collection with 127,706 documents. My aggregation pipeline has two $group stages, and it returns its result in about 1.5 seconds.

To optimize it further, I created indexes on the fields used in the $match stages, with no improvement. Is there any other way to improve aggregation performance?

I am using MongoDB 3.2.1.
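
One way to check whether the new indexes are used at all is to ask the server to explain the pipeline. The sketch below is only an illustration: the question does not show the real pipeline, so the field names (`status`, `customerId`, `amount`) and the helper names are hypothetical, and `db` is assumed to be a pymongo `Database` object. As far as I know, a $match stage can use an index only when it sits at the very start of the pipeline, so the small helper reports which fields such a leading $match filters on.

```python
# Hypothetical pipeline: the field names are assumptions, not taken from the question.
pipeline = [
    # $match is placed first so that an index on "status" can be considered.
    {"$match": {"status": "A"}},
    # First $group: total amount per customer.
    {"$group": {"_id": "$customerId", "total": {"$sum": "$amount"}}},
    # Second $group: average of the per-customer totals.
    {"$group": {"_id": None, "avgTotal": {"$avg": "$total"}}},
]


def leading_match_fields(stages):
    """Return the fields filtered by a $match at the very start of the pipeline."""
    if stages and "$match" in stages[0]:
        return sorted(stages[0]["$match"])
    return []


def explain_aggregation(db, collection_name, stages):
    """Ask the server for the plan instead of running the pipeline.

    `db` is assumed to be a pymongo Database; this sends the `aggregate`
    command with `explain` set to true.
    """
    return db.command("aggregate", collection_name, pipeline=stages, explain=True)


# Fields that would need an index for the leading $match to benefit from one.
print(leading_match_fields(pipeline))
```

If the explain output shows a collection scan for the first stage, the index is not being picked up. If it shows an index scan and the pipeline is still slow, the time is probably being spent in the $group stages rather than in the $match.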

posted Mar 2, 2017 by anonymous

Similar Questions
0 votes

We are using Git 1.8.2 for version control. It is a centralized server, and git status takes too long.

How can we improve the performance of git status?

Git repo details:
Size of the .git folder is 8.9MB
Number of commits approx 53838 (git rev-list HEAD --count)
Number of branches - 330
Number of files - 63883
Working tree clone size is 4.3GB

time git status shows
real 0m23.673s
user 0m9.432s
sys 0m3.793s

then after 5 mins
real 0m4.864s
user 0m1.417s
sys 0m4.710s

I have tried the following:
- Setting core.ignorestat to true
- git gc and git clean
- Shallow clone (reducing the number of commits)
- Cloning only one branch
- Git repacking: git repack -ad && git prune
- Cold/warm cache

Could you please let me know what other ways there are to improve Git performance?

0 votes

I have a problem with mongod's high CPU usage. We receive 3,000 records per second, and I insert them in batches of 100,000 (using the C++ driver, with a vector).

I expected CPU usage to be steady, but I intermittently see high CPU usage from mongod, even though the currentOp() command returns nothing. Why does the CPU status look like the one below?

The amount inserted during seconds 459~742 is no larger than before, but mongod's CPU usage is much higher.

What could cause this?

0 votes

I am new to MongoDB. I am trying to do some aggregation operations such as sum, avg and min on a collection. I found that I can do this using either the aggregation framework or cursor.forEach(). Which one should I use? It would help if someone could explain how both work internally and offer some suggestions.
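
To make the two options concrete, here is a minimal sketch under stated assumptions: the documents are plain Python dicts with a hypothetical numeric field `price`, and the function names are made up for illustration. The first function does what a cursor.forEach() callback would do, with every document travelling to the client and being folded into running totals there. The second only builds a $group pipeline that could be passed to a driver's aggregate call, so the server computes the same three values and returns a single document.

```python
def summarize_client_side(docs, field):
    """Fold documents one by one, as a cursor.forEach() callback would."""
    total, count, smallest = 0, 0, None
    for doc in docs:
        value = doc[field]
        total += value
        count += 1
        if smallest is None or value < smallest:
            smallest = value
    average = total / count if count else None
    return {"sum": total, "avg": average, "min": smallest}


def summary_pipeline(field):
    """Build a single-stage $group pipeline computing the same values server-side."""
    path = "$" + field
    return [
        {
            "$group": {
                "_id": None,  # a null _id puts every document in one group
                "sum": {"$sum": path},
                "avg": {"$avg": path},
                "min": {"$min": path},
            }
        }
    ]


# Hypothetical sample data standing in for a collection.
sample = [{"price": 10}, {"price": 4}, {"price": 16}]
client_result = summarize_client_side(sample, "price")
server_pipeline = summary_pipeline("price")
```

The practical difference is where the work happens: the loop transfers every document over the network, while the pipeline transfers only the result.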

Thank you in advance

+1 vote

The table below contains about a billion rows:

CREATE TABLE `Sample1` (
  `c1` bigint(20) NOT NULL AUTO_INCREMENT,
  `c2` varchar(45) NOT NULL,
  `c3` tinyint(4) DEFAULT NULL,
  `c4` tinyint(4) DEFAULT NULL,
  `time` bigint(20) DEFAULT NULL,
  PRIMARY KEY (`c1`),
  KEY `varchar_time_idx` (`c2`,`time`),
  KEY `varchar_c3_time_idx` (`c2`,`c3`,`time`),
  KEY `varchar_c4_time_idx` (`c2`,`c4`,`time`),
  KEY `varchar_c3_c4_time_idx` (`c2`,`c3`,`c4`,`time`)
) ENGINE=InnoDB AUTO_INCREMENT=10093495 DEFAULT CHARSET=utf8

Four multi-column indexes were created because the WHERE clauses use the following column combinations:

1) c2 and time
2) c2 and c3 and time
3) c2 and c4 and time
4) c2 and c3 and c4 and time

c2, c3 and c4 have very low cardinality (e.g., out of one million rows, c2, c3 and c4 each have about 50 unique values).

The time column contains mostly unique values.

Selects, inserts and updates happen frequently.

The table has 5 indexes (4 of them multi-column). Because of this: 1) inserts and updates on indexed columns become costlier; 2) as the table keeps growing (nearly one billion rows), the index size also increases rapidly.

Kindly suggest a good approach in MySQL for this use case.

...