Linked data uses the Resource Description Framework (RDF) to identify resources with Uniform Resource Identifiers (URIs) and to describe them with a set of statements, each specifying the value of a given property for the resource. We can represent this in R using a character vector for the URIs together with a data frame for the descriptions. That data frame should include a uri column to identify the resource being described in each row.
uris <- c("http://example.net/id/apple",
          "http://example.net/id/banana",
          "http://example.net/id/carrot")
labels <- c("Apple", "Banana", "Carrot")
descriptions <- data.frame(uri=uris, label=labels)
food <- resource(uris, descriptions)
The resource() constructor returns an ldf_resource object that has a variety of methods defined on it, including the format() generic, which allows us to use the labels instead of the URIs when printing to the console.
(picnic <- data.frame(food=food, quantity=c(3,2,0)))
#>     food quantity
#> 1  Apple        3
#> 2 Banana        2
#> 3 Carrot        0
The contents of the vector itself can differ from the attached descriptions. This allows you to repeat values without needing to duplicate descriptions:
(kitchen <- data.frame(
  dish=c("Fruit Salad", "Fruit Salad", "Carrot Salad", "Carrot Salad"),
  food=resource(c("http://example.net/id/apple",
                  "http://example.net/id/banana",
                  "http://example.net/id/apple",
                  "http://example.net/id/carrot"),
                descriptions),
  quantity=c(2,2,1,3)))
#>           dish   food quantity
#> 1  Fruit Salad  Apple        2
#> 2  Fruit Salad Banana        2
#> 3 Carrot Salad  Apple        1
#> 4 Carrot Salad Carrot        3
The underlying identity of each resource in the vector can be retrieved with uri():
uri(food)
#> [1] "http://example.net/id/apple"  "http://example.net/id/banana"
#> [3] "http://example.net/id/carrot"
There’s also the curie() function for retrieving URIs compacted with prefixes (see also: default_prefixes()):
curie(food, prefixes=c(food="http://example.net/id/"))
#> [1] "food:apple"  "food:banana" "food:carrot"
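To make the idea of compacting a URI concrete, here is a minimal base R sketch that does not use the package. The helper compact_uri() is hypothetical and is not how curie() is implemented; it assumes that a URI is compacted only when it starts with one of the supplied namespaces and is otherwise left unchanged.

```r
# Hypothetical helper for illustration only; not the package's curie().
# `prefixes` is a named character vector: names are prefixes, values are namespaces.
compact_uri <- function(uris, prefixes) {
  out <- uris
  for (name in names(prefixes)) {
    namespace <- prefixes[[name]]
    # Find the URIs that begin with this namespace
    hit <- startsWith(uris, namespace)
    # Replace the namespace with "prefix:" and keep the remainder of the URI
    out[hit] <- paste0(name, ":", substring(uris[hit], nchar(namespace) + 1))
  }
  out
}

example_uris <- c("http://example.net/id/apple",
                  "http://example.net/id/banana",
                  "http://example.org/other/thing")
compacted <- compact_uri(example_uris, c(food = "http://example.net/id/"))
print(compacted)
```

The third URI does not match the namespace, so under the assumption above it comes back untouched.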
You can retrieve the descriptions with description():
description(food)
#>                            uri  label
#> 1  http://example.net/id/apple  Apple
#> 2 http://example.net/id/banana Banana
#> 3 http://example.net/id/carrot Carrot
To access individual properties from the resource’s description, you can use property():
property(food, "label")
#> [1] Apple  Banana Carrot
#> Levels: Apple Banana Carrot
The second argument is the name of a column from the description.
Since label is such a commonly used property, there’s also a function provided for it: label(). You can use these functions to perform operations on resources in terms of their descriptions:
food[label(food) == "Apple"]
#> <ldf_resource[1]>
#> [1] Apple
#> Description: uri, label
We use the label to pretty-print linked data frames with format.ldf_resource():
format(kitchen)
#>           dish   food quantity
#> 1  Fruit Salad  Apple        2
#> 2  Fruit Salad Banana        2
#> 3 Carrot Salad  Apple        1
#> 4 Carrot Salad Carrot        3
Because the base type of resources is character, R will tend to dispatch on this basis. To prevent functions from base R (or other packages that aren’t expecting any novel S3 vectors) from misinterpreting resources, we have as.character() return the URI and not the resource’s label:
as.character(food)
#> [1] "http://example.net/id/apple"  "http://example.net/id/banana"
#> [3] "http://example.net/id/carrot"
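The split between what is displayed and what is returned on coercion can be illustrated with a toy S3 class built on a character vector. This is only a sketch: toy_resource and its labels attribute are hypothetical and say nothing about how ldf_resource is actually implemented. Here format() finds a class-specific method, while as.character() has no method for the toy class and so falls back to the default, which strips the attributes and returns the underlying URIs.

```r
# Hypothetical toy class for illustration; not the package's ldf_resource.
toy_resource <- function(uris, labels) {
  # The data are the URIs; the labels travel along as an attribute
  structure(uris, labels = labels, class = "toy_resource")
}

# format() dispatches on the class and shows the labels
format.toy_resource <- function(x, ...) {
  attr(x, "labels")
}

x <- toy_resource(
  c("http://example.net/id/apple", "http://example.net/id/banana"),
  c("Apple", "Banana")
)

shown <- format(x)        # uses format.toy_resource
plain <- as.character(x)  # default coercion: the underlying character data

print(shown)
print(plain)
```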
This can be unexpected. The table() function, for example, returns counts by URI:
table(kitchen$food)
#> 
#>  http://example.net/id/apple http://example.net/id/banana 
#>                            2                            1 
#> http://example.net/id/carrot 
#>                            1
You can count by label instead by calling table() on the labels, for example table(label(kitchen$food)).
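What counting by label amounts to can be sketched in base R, without the package: look up each URI's label in the description table with match(), then tabulate the labels. The names food_uris, food_labels and label_counts are made up for this illustration; in the package, label() performs the lookup for you.

```r
# Base R sketch: count resources by label rather than by URI.
descriptions <- data.frame(
  uri = c("http://example.net/id/apple",
          "http://example.net/id/banana",
          "http://example.net/id/carrot"),
  label = c("Apple", "Banana", "Carrot")
)

# The vector may repeat URIs; the description table lists each one once
food_uris <- c("http://example.net/id/apple",
               "http://example.net/id/banana",
               "http://example.net/id/apple",
               "http://example.net/id/carrot")

# Look up the label for every URI, then tabulate
food_labels <- descriptions$label[match(food_uris, descriptions$uri)]
label_counts <- table(food_labels)
print(label_counts)
```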
You can convert a linked data frame back into a “normal” data frame (i.e. one not containing vectors of RDF resources) using the as_dataframe_of_labels() function. This converts RDF resources into their labels:
kitchen_labels <- as_dataframe_of_labels(kitchen)
str(kitchen_labels)
#> 'data.frame':    4 obs. of  3 variables:
#>  $ dish    : Factor w/ 2 levels "Carrot Salad",..: 2 2 1 1
#>  $ food    : Factor w/ 3 levels "Apple","Banana",..: 1 2 1 3
#>  $ quantity: num  2 2 1 3