WIP: Scrape README from GitHub if the source is provided #292

Closed
wants to merge 3 commits into from
Changes from 1 commit
14 changes: 13 additions & 1 deletion app/models/project/Project.scala
@@ -23,6 +23,7 @@ import ore.project.FlagReasons.FlagReason
 import ore.project.{Categories, ProjectMember}
 import ore.user.MembershipDossier
 import ore.{Joinable, Visitable}
+import util.GitHubUtil
 import util.StringUtils._

 /**
@@ -446,7 +447,18 @@ case class Project(override val id: Option[Int] = None,
     * @return Project home page
     */
   def homePage: Page = Defined {
-    val page = new Page(this.id.get, Page.HomeName, Page.Template(this.name, Page.HomeMessage), false, -1)
+    var body = Page.HomeMessage
Contributor

This should only scrape the README if a home page has not been manually provided, yes? If that is not the behaviour with this PR, it should be.

Contributor Author

I'm not quite sure what you're saying. This scrapes the README if a source URL is provided and that URL points to GitHub.

Contributor

I'm asking if this will overwrite an existing manually-created README.

Contributor Author

@phase phase Oct 31, 2017

When a Project is created, it creates a Home Page that has Page.HomeMessage in it. Lines 448-451 retrieve the Page from the database, passing the page variable as a default. This PR changes that default to the README scraped from GitHub, so it is basically replacing the text from Page.HomeMessage with the README. So no, it will not replace the Home Page after it has been edited.
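
To make the get-or-insert behaviour described above concrete, here is a minimal in-memory sketch. `PageRow`, `InMemoryPages` and `save` are hypothetical stand-ins, not Ore's schema API. The only assumption carried over from the PR is that `getOrInsert` returns the stored row when one exists and stores the supplied default otherwise.

```scala
import scala.collection.mutable

// Hypothetical stand-in for a stored page; not Ore's actual Page model.
final case class PageRow(projectId: Int, name: String, contents: String)

// Hypothetical in-memory table keyed by (project id, page name).
final class InMemoryPages {
  private val rows = mutable.Map.empty[(Int, String), PageRow]

  // Returns the stored row if one exists; otherwise stores and returns the default.
  def getOrInsert(default: PageRow): PageRow =
    rows.getOrElseUpdate((default.projectId, default.name), default)

  // Overwrites the stored row, as a manual edit of the page would.
  def save(row: PageRow): Unit =
    rows((row.projectId, row.name)) = row
}
```

Under that assumption the scraped README is only used the first time the home page is requested: once a row exists, whether or not it has been edited by hand, a later default passed to `getOrInsert` is ignored.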

+    val source = this.settings.source
+    if (source.isDefined && GitHubUtil.isGitHubUrl(source.get)) {
+      val urlParts = source.get.split("//github.com/")(1).split("/")
+      val ghUser = urlParts(0)
+      val ghProject = urlParts(1)
+      val readme = GitHubUtil.getReadme(ghUser, ghProject)
+      if (readme != null && readme.isDefined) {
+        body = readme.get
+      }
+    }
+    val page = new Page(this.id.get, Page.HomeName, Page.Template(this.name, body), false, -1)
     this.service.await(page.schema.getOrInsert(page)).get
   }
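
As one possible alternative to the `isDefined`/`get` checks and the `split` calls in this hunk, the URL can be validated and taken apart in a single step with a regex extractor. This is only a sketch: `GitHubSource`, `parse` and `homeBody` are hypothetical names, and `fetch` stands in for `GitHubUtil.getReadme`. The pattern keeps the PR's identifier characters but escapes the dot and puts the hyphen last in the character class so both are read literally.

```scala
object GitHubSource {
  // Same identifier characters as the PR; hyphen last so it is a literal, dot escaped.
  private val Repo = """https?://github\.com/([A-Za-z0-9_-]+)/([A-Za-z0-9_-]+)/?""".r

  // Some((user, project)) for a repository root URL, None for anything else.
  // Scala regex extractors must match the whole string, so deeper paths are rejected.
  def parse(url: String): Option[(String, String)] = url match {
    case Repo(user, project) => Some((user, project))
    case _                   => None
  }

  // Picks the README when the source is a GitHub URL and a README was found,
  // and falls back to the default body in every other case.
  def homeBody(source: Option[String],
               default: String,
               fetch: (String, String) => Option[String]): String =
    source
      .flatMap(parse)
      .flatMap { case (user, project) => fetch(user, project) }
      .getOrElse(default)
}
```

With this shape `body` no longer needs to be a `var`, and the `readme != null` check becomes unnecessary as long as the fetch function always returns an `Option`.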

24 changes: 24 additions & 0 deletions app/util/GitHubUtil.scala
@@ -0,0 +1,24 @@
+package util
+
+import java.io.FileNotFoundException
+
+import scala.io.Source
+
+object GitHubUtil {
+
+  private val identifier = "A-Za-z0-9-_"
+  private val gitHubUrlPattern = s"""http(s)?://github.com/[$identifier]+/[$identifier]+(/)?""".r.pattern
+  private val readmeUrl = "https://raw.githubusercontent.com/%s/%s/master/README.md"
Contributor

Have you tested this with varying capitalisation?

  • readme.md
  • README.MD
  • readme.MD

Additionally, GitHub supports more than just .md as a file extension for markdown: https://github.com/github/markup/blob/master/lib/github/markup/markdown.rb#L32

Contributor Author

This currently only supports README.md. I'll update it with different types soon.
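
One way the follow-up could look is to generate a list of candidate file names and try them in order. The sketch below is an assumption-laden illustration, not the PR's eventual implementation: `ReadmeCandidates`, `urls` and `firstReadme` are hypothetical names, the base names and extensions are examples rather than the authoritative list in github/markup, and `fetch` stands in for whatever performs the HTTP request. It assumes the raw-file path is matched case-sensitively, which is why each spelling is listed separately.

```scala
object ReadmeCandidates {
  // Illustrative values only; the authoritative extension list lives in github/markup.
  val baseNames: Seq[String]  = Seq("README", "readme", "Readme")
  val extensions: Seq[String] = Seq("md", "MD", "markdown", "mdown", "mkdn")

  // Same URL shape as the PR, with the file name left as a third placeholder.
  private val rawUrl = "https://raw.githubusercontent.com/%s/%s/master/%s"

  // Every base-name/extension combination, in declaration order.
  def urls(user: String, project: String): Seq[String] =
    for {
      base <- baseNames
      ext  <- extensions
    } yield rawUrl.format(user, project, s"$base.$ext")

  // Tries each URL lazily and returns the first body that `fetch` yields.
  def firstReadme(user: String,
                  project: String,
                  fetch: String => Option[String]): Option[String] =
    urls(user, project).iterator.map(fetch).collectFirst { case Some(body) => body }
}
```

Because the candidates are walked through an iterator, no further requests are made after the first hit. A miss on every candidate still costs one request per name, so a real implementation might prefer a single directory listing instead.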


+  def isGitHubUrl(url: String): Boolean = gitHubUrlPattern.matcher(url).matches()
+
+  def getReadme(user: String, project: String): Option[String] = {
+    try {
+      val readme = Source.fromURL(readmeUrl.format(user, project)).mkString
+      Some(readme)
+    } catch {
+      case _: FileNotFoundException => None
+    }
+  }
+
+}
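
`getReadme` as written never closes the `Source` it opens. A minimal sketch of a variant that does is shown below; `ReadmeFetcher` and `fetch` are hypothetical names, and taking the full URL as a parameter is a choice made here so the function can be exercised without network access. Like the PR, it treats only `FileNotFoundException` (what the JDK raises for a missing resource, including an HTTP 404) as "no README" and lets other I/O errors propagate.

```scala
import java.io.FileNotFoundException

import scala.io.Source

object ReadmeFetcher {
  // Reads the whole document at `url` as UTF-8, or returns None when it does not exist.
  def fetch(url: String): Option[String] =
    try {
      val source = Source.fromURL(url, "UTF-8")
      // Close the underlying stream whether or not reading succeeds.
      try Some(source.mkString)
      finally source.close()
    } catch {
      case _: FileNotFoundException => None
    }
}
```

Passing the encoding explicitly avoids depending on the platform default charset, which the single-argument `Source.fromURL` call would otherwise pick up through the implicit `Codec`.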