Kill Graphpass on graphs with greater than 40,000 nodes (Issue #22) #23

Merged
merged 3 commits into from Jun 12, 2018


greebie commented Jun 8, 2018

Upgrade gexf output to 1.3.
Kill graphpass on nodes > MAX_NODES (currently set to 40,000).

After some preliminary testing, the problem with large graphs appears to lie in the Walktrap modularity and Fruchterman-Reingold algorithms.

For now we are going to cut off analysis for any graph with more than 40,000 nodes.

Note: graphpass loads the graph into memory and then quits (freeing the memory) if the graph has more than 40,000 nodes. It would also be possible to check the file size instead, although igraph will quit if the load causes it to run out of memory.
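The check described above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: `graph_too_large` and its message text are hypothetical, and the comparison assumes the "nodes > MAX_NODES" rule stated in the description. In graphpass itself the node count would presumably come from igraph (for example `igraph_vcount`) after the file is loaded.

```c
#include <stdio.h>

/* Cut-off from this PR's first commit; a later commit raises it to 50000. */
#define MAX_NODES 40000L

/*
 * Hypothetical helper, not graphpass's actual function.
 * Returns 1 and writes a message to `err` when node_count exceeds
 * max_nodes; returns 0 when analysis may proceed.
 */
int graph_too_large(long node_count, long max_nodes, FILE *err)
{
    if (node_count > max_nodes) {
        fprintf(err, "Graph has %ld nodes; the limit is %ld. Exiting...\n",
                node_count, max_nodes);
        return 1;
    }
    return 0;
}
```

A caller would run this right after loading the graph and, when it returns 1, free the graph (with igraph, `igraph_destroy`) before exiting.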

The longer-term solution is to try snap and see whether we can handle larger graphs that way.

To test:

./graphpass -f {FILENAME}.graphml where FILENAME is a graph file with more than 40,000 nodes.

Collaborator

greebie commented Jun 8, 2018

Hi @ianmilligan1 I wonder if you can test that this works for you and then merge if you think it's working. Thanks.

Member

ruebot commented Jun 11, 2018

$ ./graphpass --file 5562-gephi.graphml --output /home/nruest/Dropbox --dir /home/nruest/Dropbox -g -q 
>>>>>>>  GRAPHPASSING >>>>>>>> 
DIRECTORY: /home/nruest/Dropbox 
STRLEN PATH: 20 
OUTPUT DIRECTORY: /home/nruest/Dropbox
PERCENTAGE: 0.000000
FILE: 5562-gephi.graphml
METHODS STRING: d
QUICKRUN: 1
REPORT: 0
SAVE: 1
Running graphpass on file: /home/nruest/Dropbox/5562-gephi.graphml
Successfully ingested graph with 324009 nodes.
Graphpass can only conduct analysis on graphs with less than 40000 nodes.
Exiting...
Collaborator

greebie commented Jun 11, 2018

I suggest we wait for details from the 50k run before merging. I've also confirmed that most of the memory usage comes from the modularity calculation and the Fruchterman-Reingold algorithm for node positioning.

@ianmilligan1

Once you update this, Ryan, I’ll do one last local test and I think it’s good to merge?

Create verbose mode (-v or --verbose).
Set MAX_NODES to 50000.
"Less" should be "Fewer" in error message.
Collaborator

greebie commented Jun 12, 2018

Added a verbose mode based on comments by @ruebot. Now, if you want all the details on stdout, add -v or --verbose. The print-outs are mostly for testing anyway.

However, it will still print messages when a run fails.
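For illustration, a verbose switch of this kind could be wired up roughly like this. It is only a sketch under stated assumptions: graphpass's actual option parsing is not shown in this thread, `parse_verbose`, `log_info`, and `log_fail` are hypothetical names, and the sketch assumes `getopt_long` and ignores every option other than -v and --verbose.

```c
#include <getopt.h>
#include <stdio.h>

/*
 * Hypothetical helper: returns 1 if -v or --verbose appears in argv,
 * 0 otherwise. All other options are ignored in this sketch.
 */
int parse_verbose(int argc, char **argv)
{
    static const struct option long_opts[] = {
        {"verbose", no_argument, NULL, 'v'},
        {NULL, 0, NULL, 0}
    };
    int verbose = 0;
    int c;

    opterr = 0; /* stay quiet about options this sketch does not know */
    optind = 1; /* restart scanning so the function can be called again */
    while ((c = getopt_long(argc, argv, "v", long_opts, NULL)) != -1) {
        if (c == 'v')
            verbose = 1;
    }
    return verbose;
}

/* Progress detail: printed to stdout only in verbose mode.
 * Returns 1 if the message was printed, 0 if it was suppressed. */
int log_info(int verbose, const char *msg)
{
    if (!verbose)
        return 0;
    printf("%s\n", msg);
    return 1;
}

/* Failure message: always printed, regardless of the verbose flag. */
void log_fail(const char *msg)
{
    fprintf(stderr, "%s\n", msg);
}
```

Routing failure messages through a separate function keeps them visible whether or not the verbose flag is set.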

@ianmilligan1 ianmilligan1 merged commit d17b57c into master Jun 12, 2018

@ianmilligan1 ianmilligan1 deleted the Issue-22 branch Jun 12, 2018
