-
Notifications
You must be signed in to change notification settings - Fork 1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Bugfix: handle self-loops (more efficiently) #1231
Open
imri
wants to merge
52
commits into
thinkaurelius:titan10
Choose a base branch
from
imri:self-loops-more-efficient-bugfix
base: titan10
Could not load branches
Branch not found: {{ refName }}
Loading
Could not load tags
Nothing to show
Loading
Are you sure you want to change the base?
Some commits from the old base branch may be removed from the timeline,
and old review comments may become outdated.
Open
Bugfix: handle self-loops (more efficiently) #1231
imri
wants to merge
52
commits into
thinkaurelius:titan10
from
imri:self-loops-more-efficient-bugfix
Conversation
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Three tests not passing right now.
Conflicts: pom.xml
… suite. These tests were related to consistency of behavior around deleted elements. This proved a bit hard to enforce properly on the TinkerPop side and so this requirement was removed for 3.1.x.
Currently opted to reference the shaded jackson classes rather than depend on jackson directly in titan. Seems like it is better to handle it that way as it prevents jackson version divergence.
Conflicts: pom.xml
TP3 at 3.1.0 no longer supports hadoop 1.x. It supports 2.x. Had to make a package name change and specifically include spark-gremlin. No guarantees that what i've done packages things properly as I had to make some exclusions given hadoop 1.x dependencies. There is likely a bigger set of chagnes to the project structure and poms that what was done here to just get this to build.
…pdate. Added a test case to make sure that TitanGraphStep folds in HasContainers even in mid-traversal (@dalaro -- I don't know where this should have gone. So please feel free to put somewhere else if you want. I put it in the standard pattern we use in TinkerPop). Added GraphComputer.config(key,value) as a default method to TitanGraphComputer. It simply does nothing. I don't know if you guys want to have it set configurations. If you do, please do what is appropriate instead of it just doing nothing. Again, that was a TinkerPop3.1 update.
…kwards compatible. Given that Titan just returned this, removed the method.
Upgrade tinkerpop 3.1.0-incubating
Assertions after calling clopen() in BerkeleyElasticsearchTest have been failing since the upgrade from TP 3.0 to 3.1. I can't tell precisely when it broke, since many of the recent predecessor commits that bisect would touch have compile errors. However, squinting at the failure stack traces and playing with tweaks eventually pointed to a change in the default on-close behavior of transactions between TP 3.0 and 3.1. TitanIndexTest relies on commit. However, this was changed to rollback in 3.1: http://tinkerpop.apache.org/docs/3.1.0-incubating/upgrade.html#_transaction_close_default_behavior I overrode TitanIndexTest's clopen() to commit the graph's transaction before invoking super.clopen().
remove rexster dir and reference in pom.xml
Didn't do too much replacement with "Gremlin Server" stuff in the debian packaging stuff as it is generally out of date and expected to see removal as a whole.
…into pluradj-compaction_strategy
This changes the default behavior of thinkaurelius#1156. Instead of hardcoding SizeTieredCompactionStrategy as the default and applying it in the absence of an explicit user override, this commit removes the default value for COMPACTION_STRATEGY. It also updates the three use sites to check whether the config contains COMPACTION_STRATEGY before attempting to read the option's value. Similarly, if COMPACTION_OPTIONS was not set, or if it was set to an empty map, then its use sites will not call the associated builder method, allowing the implementation to use whatever default it might have (probably empty anyway).
This upgrades Jackson from 2.3.0 to 2.4.4 (for compatibility with the version of Spark preferred by TP 3.1). This downgrades Jersey from 1.18.2 to 1.9 to match Gremlin-Server. This is probably going to break something else. Independent of the version changes, this commit also puts jersey-client and jersey-guice under version management. Before, they were unmanaged, so they could get out of sync with the versions of the rest of the (managed) jersey artifacts. Hence jersey-client mismatched to jersey-server, etc.
Conflicts: pom.xml titan-core/src/main/java/com/thinkaurelius/titan/core/attribute/Geoshape.java titan-core/src/main/java/com/thinkaurelius/titan/graphdb/tinkerpop/io/graphson/TitanGraphSONModule.java
TP 3.1 changed the default tx.close() behavior from commit to rollback. This broke TitanGraphBaseTest and the classes that extend it. This has been floating in my stash for a while as I swapped between branches. It's enough to fix TitanGraphBaseTest and subclasses, but I am not convinced that it fixes all instances of the underlying problem (there could be tests that relied on the autocommit behavior that do not extend TGBT).
PR thinkaurelius#1195 contains a fix for a bug that makes the InputFormat throw an exception when attempting to read a self-loop edge. It does not contain a test. This commit adds a test that fails (since the PR has not yet been merged). The PR should fix it.
…e_1195-adeandrade
… into issue_1197-adeandrade
Formatting and changing a boxed Boolean parameter to a primitive
The problem here is that thinkaurelius#1197 caused the first job configuration executed in a JVM to be cached for the life of the JVM. The tests execute multiple jobs with distinct configurations while reusing the same JVM (Hadoop's LocalJobRunner). This commit changes GiraphInputFormat such that the configuration most recently supplied to getConf anywhere in the JVM (this is statically stored) wwill be used any time the refcount of its TitanVertexDeserializer rises from zero to one.
…tan into issue_1208-graben1437-update-slf4j-version
This script properly instantiates "g".
…nto graben1437-fixtestnullptr1
This pull request seems to come with a lot of "extra commits" or something. it makes it a bit hard to evaluate. could you please resubmit this PR against the titan11 branch and we can evaluate it at that point? |
+1 |
Call green lol |
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
As of Titan 1.0, the implementation of TitanVertexDeserializer does not handle scenario of an Edge connecting a Vertex to itself.
In such a case, the same edge loaded twice, once with IN direction, and another time with OUT. Therefore, the edge is added to the graph twice, and while the first addition succeeds, the second causes an 'Edge with id already exists' exception (because there is an attempt to add an existing key to 'graph.edges' dictionary).
As of Titan 1.1, a new method called isLoopAdded was added in order to fix this issue, which loops through the adjacent vertices of a vertex to check if its on a self-loop.
The same purpose can be achieved if you only add the edge once -
If the edge otherVertexId == vertexId, and the direction is OUT, the vertex is added to the graph, and if the direction is IN, it is ignored.
Doing so prevents the exception.