-
Notifications
You must be signed in to change notification settings - Fork 10
/
Copy pathcontributing.tex
236 lines (183 loc) · 8.91 KB
/
contributing.tex
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
\chapter{Contributing}
This chapter outlines how to set up an instance of \t{djehuty} with the goal
of modifying its source code. Or in other words: this is the developer setup.
\section{Setting up a development environment}
First, we need to obtain the latest version of the source code:
\begin{lstlisting}[language=bash]
$ git clone https://github.com/4TUResearchData/djehuty.git
\end{lstlisting}
Next, we need to create a somewhat isolated Python environment:
\begin{lstlisting}[language=bash]
$ python -m venv djehuty-env
$ . djehuty-env/bin/activate
[env]$ cd djehuty
[env]$ pip install -r requirements.txt
\end{lstlisting}
And finally, we can install \t{djehuty} in the virtual environment to make
the \t{djehuty} command available:
\begin{lstlisting}[language=bash]
[env]$ sed -e 's/@VERSION@/0.0.1/g' pyproject.toml.in > pyproject.toml
[env]$ pip install --editable .
\end{lstlisting}
If all went well, we will now be able to run \t{djehuty}:
\begin{lstlisting}[language=bash]
[env]$ djehuty --help
\end{lstlisting}
\section{Configuring \t{djehuty}}
Invoking \t{djehuty web} starts the web interface of \t{djehuty}. On what
port it makes itself available can be configured in its configuration file.
An example of a configuration file can be found in
\file{etc/djehuty/djehuty-example-config.xml}. We will use the example
configuration as the basis to configure it for the development environment.
\begin{lstlisting}[language=bash]
[env]$ cp etc/djehuty/djehuty-example-config.xml config.xml
\end{lstlisting}
In the remainder of the chapter we will assume a value of \t{127.0.0.1} for
\t{bind-address} and a value of \t{8080} for \t{port}.
\subsection{Modifications to the example configuration for developers}
The chapter \refer{chap:configuring-djehuty} describes each configuration
option for \t{djehuty}. The remainder of sections here contain a fast-path
through configuring \t{djehuty} for use in a development setup.
\subsubsection{Live reload}
The \t{djehuty} program can be configured to automatically reload itself when
a change is detected by setting \t{live-reload} to \t{1}.
\subsubsection{Configuring authentication with ORCID}
The \t{djehuty} program does not have Identity Provider (IdP) capabilities,
so in order to log into the system we must configure an external IdP. With
an \href{https://orcid.org}{ORCID} account comes the ability to set up an OAuth
endpoint. Go to \href{https://orcid.org/developer-tools}{developer-tools} at
\href{https://orcid.org}{orcid.org}. When setting up the OAuth at ORCID,
choose \t{http://127.0.0.1:8080/login} as redirect URI.
Modify the following bits to reflect the settings obtained from ORCID.
\begin{lstlisting}[language=xml]
<authentication>
<orcid>
<client-id>APP-XXXXXXXXXXXXXXXX</client-id>
<client-secret>XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX</client-secret>
<endpoint>https://orcid.org/oauth</endpoint>
</orcid>
</authentication>
\end{lstlisting}
To limit who can log into a development system, accounts are not automatically
created for ORCID as IdP. So we need to configure who can log in by creating
a record in the \t{privileges} section of the configuration file.
This is also a good moment to configure additional privileges for your account.
In the following snippet, configure the ORCID with which you will log into
the system in the \t{orcid} argument.
\begin{lstlisting}[language=xml]
<privileges>
<account email="[email protected]" orcid="0000-0000-0000-0001">
<may-administer>1</may-administer>
<may-impersonate>1</may-impersonate>
<may-review>1</may-review>
</account>
</privileges>
\end{lstlisting}
\subsection{Invoking \t{djehuty}}
Once we've configured \t{djehuty} for development use, we can start the web
interface by running:
\begin{lstlisting}[language=bash]
[env]$ djehuty web --initialize --config-file=config.xml
\end{lstlisting}
The \t{-{}-initialize} option creates the internal account record and
associates the specified ORCID with it. We only need to run \t{djehuty}
with the \t{-{}-initialize} option once.
By now, we should be able to visit \t{djehuty} through a web browser at
\href{http://127.0.0.1:8080}{localhost:8080}, unless configured differently.
We should be able to log in through ORCID, and access all features of
\t{djehuty}.
\section{Navigating the source code}
In this section, we trace the path from invoking \t{djehuty} to responding
to a HTTP request.
\subsection{Starting point}
Because \t{djehuty} is installable as a Python package, we can find the
starting point for running \t{djehuty} in \t{pyproject.toml}. It reads:
\begin{lstlisting}
[project.scripts]
djehuty = djehuty.ui:main
\end{lstlisting}
So, we start our tour at \file{src/djehuty/ui.py} in the procedure called \t{main}.
\subsection{How \t{djehuty} initializes}
The \t{main} procedure calls \t{main\_inner}, which handles the command-line
arguments. When invoking \t{djehuty}, we usually invoke
\t{djehuty \textbf{web}}, which is handled by the following snippet:
\begin{lstlisting}[language=python]
import djehuty.web.ui as web_ui
...
if args.command == "web":
web_ui.main (args.config_file, True, args.initialize,
args.extract_transactions_from_log,
args.apply_transactions)
\end{lstlisting}
So, the entry-point for the \t{web} subcommand is found in
\t{src/djehuty/web/ui.py} at the \t{main} procedure.
This procedure essentially sets up an instance of \t{WebServer} (found in
\t{src/djehuty/web/wsgi.py} and uses \t{werkzeug}'s \t{run\_simple} to start
the web server.
\subsection{Translating URI paths to internal procedures}
An instance of the \t{WebServer} is passed along in werkzeug's \t{run\_simple}
procedure. Werkzeug calls the instance directly, which is handled by the
\t{\_\_call\_\_} procedure of the \t{WebServer} class. The \t{\_\_call\_\_} procedure
invokes its \t{wsgi} instance, which is configured as following:
\begin{lstlisting}[language=python]
self.wsgi = SharedDataMiddleware(self.__respond, self.static_roots)
\end{lstlisting}
The \t{\_\_respond} procedure calls \t{\_\_dispatch\_request}
In \t{\_\_dispatch\_request}, the requested URI is translated into the procedure
name using the \t{url\_map}. So, except for static resources in the
\t{src/djehuty/web/resources} folder and pre-configured static pages, URIs are
handled by a procedure in the \t{WebServer} instance.
A mapping between a URI and the procedure that is executed to handle the
request to that URI can be found in the \t{url\_map} defined in the
\t{WebServer} class in \file{wsgi.py}.
\subsection{Diving into the code that displays the homepage}
As an example, in the \t{url\_map}, we can find the following line:
\begin{lstlisting}[language=python]
R("/", self.ui_home),
\end{lstlisting}
In this case, \t{self} is a reference to an instance of the \t{WebServer}
class, so we look for a procedure called \t{ui\_home} inside the
\t{WebServer} class. Some code editors have a feature to ``go to definition''
which helps navigating.
The \t{ui\_home} gathers the summary numbers from the SPARQL endpoint
with the following line:
\begin{lstlisting}[language=python]
summary_data = self.db.repository_statistics()
\end{lstlisting}
And a list of the latest datasets with the following line:
\begin{lstlisting}[language=python]
records = self.db.latest_datasets_portal(30)
\end{lstlisting}
It then passes that information to the \t{\_\_render\_template}
procedure which renders the \file{portal.html} in the
\file{src/djehuty/web/resources/html\_templates} folder. The Jinja%
\footnote{\dhref{https://jinja.palletsprojects.com/en/3.1.x/}} package
is used to interpret the template.
\begin{lstlisting}[language=python]
return self.__render_template (request, "portal.html",
summary_data = summary_data,
latest = records, ...)
\end{lstlisting}
\subsection{Database communication}
In the \t{ui\_home} procedure, we found a call to the
\t{self.db.repository\_statistics} procedure. To find out by hand where that
procedure can be found, we can look for the place where \t{self.db} is
assigned a value:
\begin{lstlisting}[language=python]
self.db = database.SparqlInterface()
\end{lstlisting}
And from there look up where \t{database} comes from:
\begin{lstlisting}[language=python]
from djehuty.web import database
\end{lstlisting}
From which we can conclude that it can be found in
\file{src/djehuty/web/database.py}.
In the \t{repository\_statistics} procedure, we find a call to
\t{self.\_\_query\_from\_template} followed by a call to \t{\_\_run\_query}
which takes the output of the former procedure as its input.
As the name implies, \t{\_\_run\_query} sends the query to the SPARQL endpoint
and retrieves the results by putting them in a list of Python dictionaries.
The \t{self.\_\_query\_from\_template} procedure takes one parameter, which is
the name of the template file (minus the extension) that contains a SPARQL
query. These templates can be found in the
\file{src/djehuty/web/resources/sparql\_templates} folder.