scala-uri is a small Scala library that helps you work with URIs. It has the following features:
- An RFC 3986 compliant parser to parse URLs and URNs from Strings
- URL Builders to create URLs from scratch
- Ability to transform query strings with methods such as filterQuery and mapQuery
- Ability to replace and remove query string parameters
- Ability to extract TLDs and public suffixes such as .com and .co.uk from hosts
- Ability to render URLs in punycode
- Ability to parse IPv6 and IPv4 addresses
- Support for custom encoding such as encoding spaces as pluses
- Support for protocol relative urls
- Support for user information e.g. ftp://user:[email protected]
- Support for URNs
- Support for mailto URLs
- Support for scala-js
- No dependencies on existing web frameworks
To include it in your SBT project from maven central:
"io.lemonlabs" %% "scala-uri" % "1.4.1"
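In a typical build.sbt, the coordinates above would be added to libraryDependencies. This is a minimal sketch that assumes a plain JVM project with no other settings:

```scala
// build.sbt (sketch): add scala-uri to the project's dependencies
libraryDependencies += "io.lemonlabs" %% "scala-uri" % "1.4.1"
```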
Migration Guide from 0.5.x
There are also demo projects for both scala and scala-js to help you get up and running quickly.
Note: This library works best when using Scala 2.11.2+. Due to a bug in older versions of Scala, this library can result in StackOverflowExceptions for very large URLs when using versions of Scala older than 2.11.2. More details
import io.lemonlabs.uri.Url
val url = Url.parse("https://www.scala-lang.org")
The returned value has type Url with an underlying implementation of AbsoluteUrl, RelativeUrl, UrlWithoutAuthority or ProtocolRelativeUrl. If you know your URL will always be one of these types, you can use the following parse methods to get a more specific return type:
import io.lemonlabs.uri._
val absoluteUrl = AbsoluteUrl.parse("https://www.scala-lang.org")
val relativeUrl = RelativeUrl.parse("/index.html")
val mailtoUrl = UrlWithoutAuthority.parse("mailto:[email protected]")
val protocolRelativeUrl = ProtocolRelativeUrl.parse("//www.scala-lang.org")
import io.lemonlabs.uri.Urn
val urn = Urn.parse("urn:isbn:0981531687")
urn.schemeOption // This is Some("urn")
urn.nid // This is "isbn"
urn.nss // This is "0981531687"
You can use Uri.parse to parse URNs as well as URLs. Url.parse and Urn.parse are preferable as they return a more specific return type.
Url provides an apply method with a bunch of optional parameters that can be used to build URLs:
import io.lemonlabs.uri.{Url, QueryString}
val url = Url(scheme = "http", host = "lemonlabs.io", path = "/opensource")
val url2 = Url(path = "/opensource", query = QueryString.fromPairs("param1" -> "a", "param2" -> "b"))
The mapQuery method will transform the Query String of a URI by applying the specified PartialFunction to each Query String Parameter. Any parameters not matched in the PartialFunction will be left as-is.
import io.lemonlabs.uri.Url
val uri = Url.parse("/scala-uri?p1=one&p2=2&p3=true")
// Results in /scala-uri?p1_map=one_map&p2_map=2_map&p3_map=true_map
uri.mapQuery {
case (n, Some(v)) => (n + "_map", Some(v + "_map"))
}
The mapQueryNames and mapQueryValues methods provide a more convenient way to transform just Query Parameter names or values:
import io.lemonlabs.uri.Url
val uri = Url.parse("/scala-uri?p1=one&p2=2&p3=true")
uri.mapQueryNames(_.toUpperCase) // Results in /scala-uri?P1=one&P2=2&P3=true
uri.mapQueryValues(_.replace("true", "false")) // Results in /scala-uri?p1=one&p2=2&p3=false
The filterQuery method will remove any Query String Parameters for which the provided Function returns false:
import io.lemonlabs.uri.Url
val uri = Url.parse("/scala-uri?p1=one&p2=2&p3=true")
// Results in /scala-uri?p2=2
uri.filterQuery {
case (n, v) => n.contains("2") && v.contains("2")
}
uri.filterQuery(_._1 == "p1") // Results in /scala-uri?p1=one
The filterQueryNames and filterQueryValues methods provide a more convenient way to filter just by Query Parameter name or value:
import io.lemonlabs.uri.Url
val uri = Url.parse("/scala-uri?p1=one&p2=2&p3=true")
uri.filterQueryNames(_ > "p1") // Results in /scala-uri?p2=2&p3=true
uri.filterQueryValues(_.length == 1) // Results in /scala-uri?p2=2
The collectQuery method will transform the Query String of a URI by applying the specified PartialFunction to each Query String Parameter. Any parameters not matched in the PartialFunction will be removed.
import io.lemonlabs.uri.Url
val uri = Url.parse("/scala-uri?p1=one&p2=2&p3=true")
// Results in /scala-uri?p1_map=one_map
uri.collectQuery {
case ("p1", Some(v)) => ("p1_map", Some(v + "_map"))
}
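Because each of these methods returns a new URL, they can be chained. The following is a small sketch, using only the methods shown above, that first drops a parameter by name and then rewrites the remaining values. The variable name transformed is just for illustration:

```scala
import io.lemonlabs.uri.Url

val uri = Url.parse("/scala-uri?p1=one&p2=2&p3=true")

// Drop p2 by name, then upper-case the values that remain
val transformed = uri
  .filterQueryNames(_ != "p2")
  .mapQueryValues(_.toUpperCase)

transformed.toString // This is /scala-uri?p1=ONE&p3=TRUE
```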
import io.lemonlabs.uri.Url
val absoluteUrl = Url.parse("http://www.example.com/example?a=b")
absoluteUrl.toRelativeUrl // This is /example?a=b
import io.lemonlabs.uri.Url
val relativeUrl = Url.parse("/example?a=b")
relativeUrl.withScheme("http").withHost("www.example.com") // This is http://www.example.com/example?a=b
import io.lemonlabs.uri._
val uri: Uri = Uri.parse(...)
uri match {
case Uri(path) => // Matches Urns and Urls
case Urn(path) => // Matches Urns
case Url(path, query, fragment) => // Matches Urls
case RelativeUrl(path, query, fragment) => // Matches RelativeUrls
case UrlWithAuthority(authority, path, query, fragment) => // Matches AbsoluteUrls and ProtocolRelativeUrls
case AbsoluteUrl(scheme, authority, path, query, fragment) => // Matches AbsoluteUrls
case ProtocolRelativeUrl(authority, path, query, fragment) => // Matches ProtocolRelativeUrls
case UrlWithoutAuthority(scheme, path, query, fragment) => // Matches UrlWithoutAuthoritys
}
In some cases scalac will be able to detect instances where not all cases are being matched. For example:
import io.lemonlabs.uri.Uri
Uri.parse("/test") match {
case u: Url => println(u.toString)
}
results in the following compiler warning, because Uri.parse can return Urns as well as Urls:
<console>:15: warning: match may not be exhaustive.
It would fail on the following input: Urn(_)
In this instance, using Url.parse instead of Uri.parse would fix this warning.
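If you do need to accept both kinds of URI, another way to avoid the warning is to handle both cases. This is a minimal sketch; the describe function is a hypothetical helper and not part of scala-uri:

```scala
import io.lemonlabs.uri.{Uri, Url, Urn}

// Hypothetical helper: covers both subtypes of Uri, so the match is exhaustive
def describe(uri: Uri): String = uri match {
  case url: Url => s"URL with path ${url.path}"
  case urn: Urn => s"URN with namespace ${urn.nid}"
}

describe(Uri.parse("/test"))
describe(Uri.parse("urn:isbn:0981531687")) // This is "URN with namespace isbn"
```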
You can parse a String representing the host part of a URI with Host.parse. The return type is Host with an underlying implementation of DomainName, IpV4 or IpV6.
import io.lemonlabs.uri.Host
val host = Host.parse("lemonlabs.io")
import io.lemonlabs.uri.{IpV4, IpV6}
val ipv4 = IpV4.parse("13.32.214.142")
val ipv6 = IpV6.parse("[1:2:3:4:5:6:7:8]")
import io.lemonlabs.uri.{DomainName, Host, IpV4, IpV6}
val host: Host = Host.parse(...)
host match {
case Host(host) => // Matches DomainNames, IpV4s and IpV6s
case DomainName(host) => // Matches DomainNames
case ip: IpV4 => // Matches IpV4s
case ip: IpV6 => // Matches IpV6s
}
import io.lemonlabs.uri._
val path: Path = Path.parse(...)
path match {
case Path(parts) => // Matches any path
case AbsolutePath(parts) => // Matches any path starting with a slash
case RootlessPath(parts) => // Matches any path that *doesn't* start with a slash
case PathParts("a", "b", "c") => // Matches "/a/b/c" and "a/b/c"
case PathParts("a", "b", _*) => // Matches any path starting with "/a/b" or "a/b"
case EmptyPath() => // Matches ""
case PathParts() => // Matches "" and "/"
case UrnPath("nid", "nss") => // Matches a URN Path "nid:nss"
}
By default, scala-uri will URL percent encode paths and query string parameters. To prevent this, you can call the uri.toStringRaw method:
import io.lemonlabs.uri.Url
val uri = Url.parse("http://example.com/path with space?param=üri")
uri.toString // This is: http://example.com/path%20with%20space?param=%C3%BCri
uri.toStringRaw // This is: http://example.com/path with space?param=üri
The characters that scala-uri will percent encode by default can be found here. You can modify which characters are percent encoded like so:
Only percent encode the hash character:
import io.lemonlabs.uri.config.UriConfig
import io.lemonlabs.uri.encoding._
implicit val config = UriConfig(encoder = percentEncode('#'))
Percent encode all the default chars, except the plus character:
import io.lemonlabs.uri.config.UriConfig
import io.lemonlabs.uri.encoding._
implicit val config = UriConfig(encoder = percentEncode -- '+')
Encode all the default chars, and also encode the letters a and b:
import io.lemonlabs.uri.config.UriConfig
import io.lemonlabs.uri.encoding._
implicit val config = UriConfig(encoder = percentEncode ++ ('a', 'b'))
The default behaviour of scala-uri is to encode spaces as %20. If you instead wish them to be encoded as the + symbol, add the following implicit val to your code:
import io.lemonlabs.uri.Url
import io.lemonlabs.uri.config.UriConfig
import io.lemonlabs.uri.encoding._
implicit val config = UriConfig(encoder = percentEncode + spaceAsPlus)
val uri = Url.parse("http://theon.github.com/uri with space")
uri.toString // This is http://theon.github.com/uri+with+space
If you would like to do some custom encoding for specific characters, you can use the encodeCharAs encoder.
import io.lemonlabs.uri.Url
import io.lemonlabs.uri.config.UriConfig
import io.lemonlabs.uri.encoding._
implicit val config = UriConfig(encoder = percentEncode + encodeCharAs(' ', "_"))
val uri = Url.parse("http://theon.github.com/uri with space")
uri.toString // This is http://theon.github.com/uri_with_space
By default, scala-uri will URL percent decode paths and query string parameters during parsing:
import io.lemonlabs.uri.Url
val uri = Url.parse("http://example.com/i-have-%25been%25-percent-encoded")
uri.toString // This is: http://example.com/i-have-%25been%25-percent-encoded
uri.toStringRaw // This is: http://example.com/i-have-%been%-percent-encoded
To prevent this, you can bring the following implicit into scope:
import io.lemonlabs.uri.Url
import io.lemonlabs.uri.config.UriConfig
import io.lemonlabs.uri.decoding.NoopDecoder
implicit val c = UriConfig(decoder = NoopDecoder)
val uri = Url.parse("http://example.com/i-havent-%been%-percent-encoded")
uri.toString // This is: http://example.com/i-havent-%25been%25-percent-encoded
uri.toStringRaw // This is: http://example.com/i-havent-%been%-percent-encoded
If your Uri contains invalid percent encoding, by default scala-uri will throw a UriDecodeException:
import io.lemonlabs.uri.Url
Url.parse("/?x=%3") // This throws a UriDecodeException
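If you would rather keep the strict default but avoid the exception propagating, one option is to wrap the call in scala.util.Try. This is only a sketch; parseOpt is a hypothetical helper and not part of scala-uri, and it turns any parse failure (not just invalid percent encoding) into None:

```scala
import scala.util.Try
import io.lemonlabs.uri.Url

// Hypothetical helper (not part of scala-uri): any parse failure becomes None
def parseOpt(s: String): Option[Url] = Try(Url.parse(s)).toOption

parseOpt("/?x=%3")                  // This is None, as %3 is not valid percent encoding
parseOpt("http://example.com/path") // This is Some(...)
```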
You can configure scala-uri to instead ignore invalid percent encoding and only percent decode correctly percent encoded values like so:
import io.lemonlabs.uri.Url
import io.lemonlabs.uri.config.UriConfig
import io.lemonlabs.uri.decoding.PercentDecoder
implicit val c = UriConfig(
decoder = PercentDecoder(ignoreInvalidPercentEncoding = true)
)
val uri = Url.parse("/?x=%3")
uri.toString // This is /?x=%253
uri.toStringRaw // This is /?x=%3
If you wish to replace all existing query string parameters with a given name, you can use the Url.replaceParams() method:
import io.lemonlabs.uri.Url
val uri = Url.parse("http://example.com/path?param=1")
val newUri = uri.replaceParams("param", "2")
newUri.toString // This is: http://example.com/path?param=2
If you wish to remove all existing query string parameters with a given name, you can use the uri.removeParams() method:
import io.lemonlabs.uri.Url
val uri = Url.parse("http://example.com/path?param=1&param2=2")
val newUri = uri.removeParams("param")
newUri.toString // This is: http://example.com/path?param2=2
scala-uri has support for not rendering query parameters that have a value of None. Set renderQuery = ExcludeNones in your UriConfig and make it visible in the scope where you parse/create your Url:
import io.lemonlabs.uri.Url
import io.lemonlabs.uri.config._
implicit val config: UriConfig = UriConfig(renderQuery = ExcludeNones)
val url = Url.parse("http://github.com/lemonlabsuk").addParamsOptionValues("a" -> Some("some"), "b" -> None)
url.toString // This is http://github.com/lemonlabsuk?a=some
To get the query string parameters as a Map[String,Seq[String]] you can do the following:
import io.lemonlabs.uri.Url
val uri = Url.parse("http://example.com/path?a=b&a=c&d=e")
uri.query.paramMap // This is: Map("a" -> Vector("b", "c"), "d" -> Vector("e"))
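Since a name can appear more than once, each entry in the map holds a sequence of values. The sketch below shows a few common lookups using only standard Map and collection methods. The variable names are illustrative:

```scala
import io.lemonlabs.uri.Url

val uri = Url.parse("http://example.com/path?a=b&a=c&d=e")
val params = uri.query.paramMap

// All values for a repeated parameter
val allA = params.getOrElse("a", Vector.empty) // This is Vector("b", "c")

// First value only, or None when the parameter is absent
val firstD = params.get("d").flatMap(_.headOption)  // This is Some("e")
val missing = params.get("z").flatMap(_.headOption) // This is None
```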
scala-uri supports user information (username and password) encoded in URLs.
Parsing URLs with user information:
import io.lemonlabs.uri.Url
val url = Url.parse("http://user:[email protected]")
url.user // This is Some("user")
url.password // This is Some("pass")
Modifying user information:
import io.lemonlabs.uri.AbsoluteUrl
val url = AbsoluteUrl.parse("http://host.com")
url.withUser("jack") // URL is now http://jack@host.com
import io.lemonlabs.uri.AbsoluteUrl
val url = AbsoluteUrl.parse("http://user:[email protected]")
url.withPassword("secret") // URL is now http://user:[email protected]
Note: using clear text passwords in URLs is ill-advised.
Protocol Relative URLs are supported in scala-uri. A Uri object with a protocol of None, but a host of Some(x) will be considered a protocol relative URL.
import io.lemonlabs.uri.Url
val uri = Url.parse("//example.com/path") // Return type is Url
uri.schemeOption // This is: None
uri.hostOption // This is: Some("example.com")
Use ProtocolRelativeUrl.parse if you know your URL will always be Protocol Relative:
import io.lemonlabs.uri.ProtocolRelativeUrl
val uri = ProtocolRelativeUrl.parse("//example.com/path") // Return type is ProtocolRelativeUrl
uri.schemeOption // This is: None
uri.host // This is: "example.com"
By default scala-uri uses UTF-8 charset encoding:
import io.lemonlabs.uri.Url
val uri = Url.parse("http://theon.github.com/uris-in-scala.html?chinese=网址")
uri.toString // This is http://theon.github.com/uris-in-scala.html?chinese=%E7%BD%91%E5%9D%80
This can be changed like so:
import io.lemonlabs.uri.config.UriConfig
import io.lemonlabs.uri.Url
implicit val conf = UriConfig(charset = "GB2312")
val uri = Url.parse("http://theon.github.com/uris-in-scala.html?chinese=网址")
uri.toString // This is http://theon.github.com/uris-in-scala.html?chinese=%CD%F8%D6%B7
Note: Currently not supported for scala-js
import io.lemonlabs.uri.Url
// This returns Some("www")
Url.parse("http://www.example.com/blah").subdomain
// This returns Some("a.b.c")
Url.parse("http://a.b.c.example.com/blah").subdomain
// This returns None
Url.parse("http://example.com/blah").subdomain
// This returns Vector("a", "a.b", "a.b.c", "a.b.c.example")
Url.parse("http://a.b.c.example.com/blah").subdomains
// This returns Some("a")
Url.parse("http://a.b.c.example.com/blah").shortestSubdomain
// This returns Some("a.b.c.example")
Url.parse("http://a.b.c.example.com/blah").longestSubdomain
These methods return None or Vector.empty for URLs without a Host (e.g. Relative URLs)
Note: Currently not supported for scala-js
The method apexDomain returns the apex domain for the URL (e.g. example.com for http://www.example.com/path)
import io.lemonlabs.uri.Url
val uri = Url.parse("http://www.google.co.uk/blah")
uri.apexDomain // This returns Some("google.co.uk")
Note: Currently not supported for scala-js
scala-uri uses the list of public suffixes from publicsuffix.org to allow you to identify the TLD of your absolute URIs.
The publicSuffix method returns the longest public suffix from your URI:
import io.lemonlabs.uri.Url
val uri = Url.parse("http://www.google.co.uk/blah")
uri.publicSuffix // This returns Some("co.uk")
The publicSuffixes method returns all the public suffixes from your URI:
import io.lemonlabs.uri.Url
val uri = Url.parse("http://www.google.co.uk/blah")
uri.publicSuffixes // This returns Vector("co.uk", "uk")
These methods return None and Vector.empty, respectively, for URLs without a Host (e.g. Relative URLs)
Note: Currently not supported for scala-js
See RFC 3490
import io.lemonlabs.uri.Url
val url = Url.parse("https://はじめよう.みんな/howto.html")
url.toStringPunycode // This returns "https://xn--p8j9a0d9c9a.xn--q9jyb4c/howto.html"
Mailto URLs are best parsed with UrlWithoutAuthority.parse, but can also be parsed with Url.parse
import io.lemonlabs.uri.UrlWithoutAuthority
val mailto = UrlWithoutAuthority.parse("mailto:[email protected]?subject=Hello")
mailto.scheme // This is Some("mailto")
mailto.path // This is "[email protected]"
mailto.query.param("subject") // This is Some("Hello")
By importing io.lemonlabs.uri.dsl._, you may use a DSL to construct URLs
import io.lemonlabs.uri.dsl._
// Query Strings
val uri = "http://theon.github.com/scala-uri" ? ("p1" -> "one") & ("p2" -> 2) & ("p3" -> true)
uri.toString //This is: http://theon.github.com/scala-uri?p1=one&p2=2&p3=true
val uri2 = "http://theon.github.com/scala-uri" ? ("param1" -> Some("1")) & ("param2" -> None)
uri2.toString //This is: http://theon.github.com/scala-uri?param1=1&param2
val uri3 = "http://theon.github.com/scala-uri" ? "param1=1"
uri3.toString //This is: http://theon.github.com/scala-uri?param1=1
// Paths
val uri4 = "http://theon.github.com" / "scala-uri"
uri4.toString //This is: http://theon.github.com/scala-uri
// Fragments
val uri5 = "http://theon.github.com/scala-uri" `#` "fragments"
uri5.toString //This is: http://theon.github.com/scala-uri#fragments
See scala-uri-scalajs-example for usage
scala-uri 1.x.x is currently built with support for scala 2.12.x, 2.11.x
- For 2.10.x support use scala-uri 0.4.17 from branch 0.4.x
- For 2.9.x support use scala-uri 0.3.6 from branch 0.3.x
Release builds are available in maven central. For SBT users just add the following dependency:
"io.lemonlabs" %% "scala-uri" % "1.4.1"
For maven users you should use (for 2.12.x):
<dependency>
<groupId>io.lemonlabs</groupId>
<artifactId>scala-uri_2.12</artifactId>
<version>1.4.1</version>
</dependency>
Contributions to scala-uri are always welcome. Check out the Contributing Guidelines
Thanks to @evanbennett. 1.x.x is inspired by his fork here and discussion here.
- Package change from com.netaporter.uri to io.lemonlabs.uri
- The single Uri case class has now been replaced with a class hierarchy. Use the most specific class in this hierarchy that fits your use case
- Uri used to be a case class, but the replacements Uri and Url are now traits. This means they no longer have a copy method. Use the with methods instead (e.g. withHost, withPath etc)
- host method on Url now has return type Host rather than String. You may have to change url.host to url.host.toString
- path method on Url now has return type Path rather than String. You may have to change url.path to url.path.toString
- Changed parameter value type from Any to String in methods addParam, addParams, replaceParams. Please now call .toString before passing non String types to these methods
- Changed parameter value type from Option[Any] to Option[String] in method replaceAll. Please now call .toString before passing non String types to this method
- Query string parameters with a value of None will now be rendered with no equals sign by default (e.g. ?param). Previously some methods (such as ?, &, \?, addParam and addParams) would not render parameters with a value of None at all. In 1.x.x, this behaviour can be achieved by using the renderQuery config option.
- In most cases Url.parse should be used instead of Uri.parse. See all parse methods here
- scheme is now called schemeOption on Uri. If you have an instance of AbsoluteUrl or ProtocolRelativeUrl there is still a scheme method but it returns String rather than Option[String]
- protocol method has been removed from Uri. Use schemeOption instead
- Type changed from Seq to Vector for:
  - subdomains, publicSuffixes, params return type
  - removeAll and removeParams argument types
  - params field in QueryString
  - paramMap and pathParts fields in Uri, now Url
- Methods addParam and addParams that took Option arguments are now called addParamOptionValue and addParamsOptionValues
- Method replaceAllParams has been replaced with withQueryString or withQueryStringOptionValues
- Method removeAllParams has been replaced with withQueryString(QueryString.empty)
- Method subdomain has been removed from the scala-js version. The implementation was incorrect and did not match the JVM version of subdomain. Once public suffixes are supported for the scala-js version, a correct implementation of subdomain can be added
- Implicit UriConfigs now need to be where your Uris are parsed/constructed, rather than where they are rendered
- Method hostParts has been removed from Uri. This method predated publicSuffix and subdomain which are more useful methods for pulling apart a host
- Field pathStartsWithSlash removed from Uri. This was only intended to be used internally. You can now instead check if Uri.path is an instance of AbsolutePath to determine if the path will start with slash
- Matrix parameters have been removed. If you still need this, raise an issue
- scala 2.10 support dropped, please upgrade to 2.11 or 2.12 to use scala-uri 0.5.x
- scala-js support added
- Package changes / import changes
  - All code moved from com.github.theon package to com.netaporter package
  - scala-uri has been organised into the following packages: encoding, decoding, config and dsl. You will need to update import statements.
- Name changes
  - PermissiveDecoder renamed to PermissivePercentDecoder
  - QueryString and MatrixParams constructor argument parameters shortened to params
  - Uri.parseUri renamed to Uri.parse
  - protocol constructor arg in Uri renamed to scheme
  - Querystring renamed to QueryString
- Query String constructor argument parameters changed type from Map[String, List[String]] to Seq[(String,String)]
- Uri constructor argument pathParts changed type from List to Vector
- Uri method to add query string parameters renamed from params to addParams. Same with matrixParams -> addMatrixParams
- PercentEncoderDefaults object renamed to PercentEncoder companion object.
- Copy methods user/password/port/host/scheme now all prefixed with with, e.g. withHost
- New UriConfig case class used to specify encoders, decoders and charset to be used. See examples in Custom encoding, URL Percent Decoding and Character Sets
scala-uri is open source software released under the Apache 2 License.