Archive for the ‘e-Government’ Category

The IMPACT Project — first two days

Tuesday, February 2nd, 2010

As I mentioned in a previous post, I am working in Amsterdam for the next three months on setting up a research project at the Leibniz Center for Law. The focus here is to develop information extraction from textual debates (using GATE) and a tool for inputting debates in a structured manner so that they can be further processed for reasoning.

The official IMPACT Project information on CORDIS.

As part of my contribution, I have two draft papers, written in the spring and summer of 2009, which will be further developed at Leibniz: From Arguments in Natural Language to Argumentation Frameworks and Multi-modal Multi-threaded Online Forums. While these are early drafts of papers and not for wider circulation, they give a good indication of the line of thinking and of some of the key ideas we will be pursuing. Comments about these works are very welcome.

By Adam Wyner
Distributed under the Creative Commons
Attribution-Non-Commercial-Share Alike 2.0

Discussion with Jeremy Tobias-Tarsh of Practical Law Company

Monday, January 18th, 2010

On Wednesday January 13, 2010, I had a meeting with Jeremy Tobias-Tarsh, director of Practical Law Company (PLC) and currently in charge of overseeing the company’s three-year development plan. We had a very engaging, far-ranging discussion about the company’s interests in technological innovation in the legal domain. His colleagues at the meeting were Brigitte Kaltenbacher, who works on usability tests for searches among the company’s resources, and Sara Stangalini, who works with Brigitte.

The post gives an overview of our discussion — what PLC does, the ambitions for the future, a range of issues and tools to handle them, and some suggestions about moving ahead.

About PLC

PLC provides know-how for lawyers, meaning written analysis of current legal developments, practice notes (legal situations lawyers face and how the law treats them), standard draft documents, and checklists for managing actions. The services cover a range of legal areas such as arbitration, competition, corporate, construction, employment, finance, pensions, tax, and so on.

Jeremy spoke of an ambition at the company to use Semantic Web technologies on the company’s resources in order to give users faster, more precise, more meaningful and relevant results for searches in the resources — making the company’s content more findable. This might be done by annotating the content of the resources and supporting search with respect to the annotations. (Along these lines, an important advantage is that the company has been using an XML editor (Epic) for its documents for some time, so there is broad and widespread familiarity with what XML offers.)

Similarly, PLC could develop tools which improve the searches among a law firm’s documents. This is especially crucial where searches are done by junior staff with less knowledge of how and where to search. As made clear in discussions of knowledge management in law firms, an important task of senior lawyers in a firm is to train the new and junior lawyers in the details of the practice. While law schools may train law students in legal analysis and the law, the students may be unprepared for how to practice, which may have less to do with the law and more to do with finding and working with the relevant documents.

Any technology which can support junior lawyers in learning their tasks would be an advantage. In addition, any technology which could encode a senior lawyer’s knowledge would be useful to share throughout the firm and to preserve that knowledge where the lawyer is unavailable.

Some Sample Problems and Tools

Contracts

An instance of such a tool might apply to contracts. PLC and firms have catalogues of preformatted draft documents, each of which may have variants developed over time. This may be seen as a contract base. A junior lawyer may be asked to find among this contract base a contract which is either an exact match for the current circumstances or close enough so that with some modifications it would suit. This can be viewed as an instance of case based reasoning, where the ‘factors’ are the particulars of the contracts and the current contractual setting. So, not only must there be some way to match similarity and difference among the documents, but there ought also to be some systematic way to manage the modifications.

To address this, three technologies could be used. Contracts could be annotated with the factors, and case based reasoning then applied. Alternatively, contracts could be linked to an ontology, so that the properties and relationships among the documents are made explicit; researchers could then search for the relevant documents using the ontology. Along with either approach, a contract modification tracking system, such as a modified version of one which meets the MetaLex standard, could be developed.
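
As a rough illustration of the factor-matching idea, here is a minimal Python sketch. The contract names, the factor labels, and the choice of Jaccard overlap as the similarity measure are all assumptions made for illustration; they are not drawn from PLC's resources or from any particular case based reasoning system. The sketch returns the closest contract in the base together with the factors that would have to be added or removed, which is the part a modification tracking system would need to manage.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Contract:
    """A draft contract annotated with a set of factors (hypothetical labels)."""
    name: str
    factors: frozenset


def jaccard(a, b):
    """Overlap between two factor sets: 1.0 means identical, 0.0 means disjoint."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def closest_contract(base, required):
    """Return the best matching contract plus the factors to add and to remove."""
    best = max(base, key=lambda c: jaccard(c.factors, required))
    to_add = required - best.factors
    to_remove = best.factors - required
    return best, to_add, to_remove


# A toy contract base; every name and factor here is invented.
contract_base = [
    Contract("employment_fixed_term", frozenset({"employment", "fixed_term", "uk_law"})),
    Contract("employment_permanent", frozenset({"employment", "permanent", "uk_law"})),
    Contract("construction_subcontract", frozenset({"construction", "subcontract", "uk_law"})),
]

# The current contractual setting, described with the same factor vocabulary.
current_setting = frozenset({"employment", "fixed_term", "uk_law", "non_compete"})

match, to_add, to_remove = closest_contract(contract_base, current_setting)
print(match.name, sorted(to_add), sorted(to_remove))
```

In this toy case the fixed-term employment contract is the closest match, and the only modification needed is to add a non-compete clause. A realistic system would need weighted factors and a far richer notion of similarity.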

Due Diligence

Another problem relates to due diligence. Law firms are up against constraints in terms of time and money in satisfying the requirements of due diligence. Firms are increasingly responsible for showing due diligence in a wider range of areas. This means that more lawyers must be hired and more billable hours accrued. However, the companies that hire the law firms are reluctant to pay more for due diligence. Consequently, firms have a motivation to find ways to make due diligence more efficient. Moreover, it is not a task that junior lawyers can easily undertake without extensive training. Natural language expert systems might provide a useful technology.

Policy Consultations

We also had a discussion about policy consultations. PLC helped form and serves as secretariat for the General Counsel 100 Group, which comprises senior legal officers drawn from FTSE 100 companies. The group is a forum for businesses to give input on policy consultations and to share best practices in law, risk management, compliance, and other common interests (see the various public papers on the link). In my EU Framework 7 proposal on argumentation, we explicitly referred to policy consultation as a key area to develop and apply the tool. Broadly speaking, we had a systematic plan to develop a tool which takes as input statements in natural language, then translates them into a logical formalism. Claims pro and con on a particular issue are systematically structured into an ‘argument’ network in order to ‘prove’ outcomes given premises as well as to provide sets of consistent statements for and against a claim. Other argument mapping technologies might be useful here as well.
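
To make the idea of an ‘argument’ network a little more concrete, here is a minimal Python sketch. It assumes a Dung-style abstract argumentation framework, in which arguments are nodes and attacks are directed edges, and it computes the grounded extension (the most cautious set of arguments that can be jointly accepted). This is one standard formalism chosen for illustration, not necessarily the one used in the proposal, and the three policy arguments are invented.

```python
def grounded_extension(arguments, attacks):
    """Compute the grounded extension of an abstract argumentation framework.

    arguments: set of argument labels
    attacks: set of (attacker, target) pairs
    """
    # For each argument, collect the arguments that attack it.
    attackers = {a: {x for (x, y) in attacks if y == a} for a in arguments}
    accepted = set()
    while True:
        # Arguments attacked by something already accepted are defeated.
        defeated = {y for (x, y) in attacks if x in accepted}
        # An argument is acceptable when all of its attackers are defeated.
        newly = {a for a in arguments if attackers[a] <= defeated}
        if newly == accepted:
            return accepted
        accepted = newly


# Hypothetical consultation: c attacks b, and b attacks a.
arguments = {
    "a",  # "Introduce the charge: it will reduce traffic."
    "b",  # "The charge will harm local businesses."
    "c",  # "Trade was unaffected where similar charges were introduced."
}
attacks = {("b", "a"), ("c", "b")}

print(sorted(grounded_extension(arguments, attacks)))
```

Here argument c is unattacked, so it is accepted; it defeats b, which in turn reinstates a. Where two arguments simply attack one another, the grounded extension accepts neither, which reflects the cautious character of this semantics.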

Ontologies

We also talked about the development of ontologies and whether they can be automatically extracted from textual sources. This is an area where there is a lot of current interest and some significant progress.

Moving Ahead

Finally, we also touched on how to move ahead. A brainstorming and road-mapping exercise could be a very valuable experience. The exercise would include not only company representatives, but also clients served by PLC. Parties on ‘both sides of the fence’ could discover more about what they know, want, and imagine could be done. In addition, Jeremy suggested that I might be engaged to present some of the ‘main points’ about Semantic Web technologies and the law to some of PLC’s editors and clients.

It was an enjoyable and spirited discussion, which I hope we will find the opportunity in the near future to continue.

Research on Argumentation at the Leibniz Center for Law in Amsterdam

Monday, January 4th, 2010

I have a 3 month research job at the Leibniz Center for Law, University of Amsterdam starting February 1 and working with Tom van Engers. This is part of the IMPACT project:

IMPACT is an international project, partially funded by the European Commission under the 7th framework programme. It will conduct original research to develop and integrate formal, computational models of policy and arguments about policy, to facilitate deliberations about policy at a conceptual, language-independent level. To support the analysis of policy proposals in an inclusive way which respects the interests of all stakeholders, research on tools for reconstructing arguments from data resources distributed throughout the Internet will be conducted. The key problem is translation from these sources in natural language to formal argumentation structures, which will be input for automatic reasoning.

My role will be to set up a Ph.D. research project concerning the key problem. This is based on an unsuccessful larger research proposal that I made with Tom. I’ll be organising the database, the literature, some of the software, and outlining the approach the student would take. I’ll make notes on the progress as it happens.

I’m looking forward to living for a while in Amsterdam, working with Tom and my other colleagues at the center: Joost Breuker, Rinke Hoekstra, and Emile de Maat. The Netherlands also has a very lively Department of Argumentation Theory. As an added bonus, my colleagues from Linguistics, Susan Rothstein and Fred Landman, are in Amsterdam on sabbatical. It will be a very interesting and fun period.

Annotating Rules in Legislation

Friday, November 27th, 2009

Over the last couple of months, I have had discussions about text mining and annotating rules in legislation with several people (John Sheridan of The Office of Public Sector Information, Richard Goodwin of The Stationery Office, and John Cyriac of Compliance Track). While nothing concrete has yet resulted from these discussions, it is clearly a “hot topic”.

In the course of these discussions, I prepared a short outline of the issues and approaches, which I present below. Comments, suggestions, and collaborations are welcome.

Vision, context, and objectives

One of the main visions of artificial intelligence and law has been to develop a legislative processing tool. Such a tool has several related objectives:

      [1.] To guide the drafter to write well-formed legal rules in natural language.
      [2.] To automatically parse and semantically represent the rules.
      [3.] To automatically identify and annotate the rules so that they can be extracted from a corpus of legislation for web-based applications.
      [4.] To enable inference, modeling, and consistency testing with respect to the rules.
      [5.] To reason with respect to domain knowledge (an ontology).
      [6.] To serve the rules on the web so that users can use natural language to input information and receive determinations.

While no such tool exists, there has been steady progress on understanding the problems and developing working software solutions. In early work (see The British nationality act as a logic program (1986)), an act was manually translated into a program, allowing one to draw inferences given ground facts. Haley is a software and service company which provides a framework which partially addresses 1, 2, 4, and 6 (see Policy Automation). Some research addresses aspects of 3 (see LKIF-Core Ontology). Finally, there are XML annotation schemas for legislation (and related input support) such as The Crown XML Schema for Legislation and Akoma Ntoso, both of which require manual input. Despite these advances, there is much progress yet to be made. In particular, no results fulfill [3.].

In consideration of [3.], the primary objective of this proposal is to use the General Architecture for Text Engineering (GATE) framework in order to automatically identify and annotate legislative rules from a corpus. The annotation should support web-based applications and be consistent with semantic web mark ups for rules, e.g. RuleML. A subsidiary objective is to define an authoring template which can be used within existing authoring applications to manually annotate legislative rules.
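
To give a feel for what automatic identification and annotation of rules might involve, here is a minimal Python sketch. It is not GATE and does not use GATE's own pattern language; it is a plain regular-expression stand-in. The cue words, the naive sentence splitter, and the `<rule modality="...">` tag are all assumptions made for illustration, and the tag is not RuleML or any existing legislative schema.

```python
import re
from xml.sax.saxutils import escape

# Deontic cue words that often signal a legal rule (a simplifying assumption).
# Longer alternatives come first so that "shall not" is not read as "shall".
DEONTIC = re.compile(r"\b(shall not|shall|must not|must|may not|may)\b", re.IGNORECASE)

MODALITY = {
    "shall": "obligation",
    "must": "obligation",
    "shall not": "prohibition",
    "must not": "prohibition",
    "may not": "prohibition",
    "may": "permission",
}


def split_sentences(text):
    """Naive sentence splitter: breaks after a full stop followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=\.)\s+", text) if s.strip()]


def annotate_rules(text):
    """Wrap each sentence containing a deontic cue in a hypothetical <rule> tag."""
    annotated = []
    for sentence in split_sentences(text):
        match = DEONTIC.search(sentence)
        if match:
            modality = MODALITY[match.group(1).lower()]
            annotated.append(f'<rule modality="{modality}">{escape(sentence)}</rule>')
        else:
            annotated.append(escape(sentence))
    return annotated


# An invented fragment written in the style of legislation.
sample = (
    "This section applies to employers. "
    "An employer shall keep records for three years. "
    "An employer must not disclose the records."
)

for line in annotate_rules(sample):
    print(line)
```

A surface pattern of this kind will both miss rules and over-generate (for example, it would treat the month "May" as a permission cue), which is why the phases of manual analysis and validation matter.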

Benefits

Attaining these objectives would:

  • Support automated creation, maintenance, and distribution of rule books for compliance.
  • Contribute to the development of a legislative processing tool.
  • Make legislative rules accessible for web-based applications. For example, given other annotations, one could identify rules that apply with respect to particular individuals in an organisation along with relevant dates, locations, etc.
  • Enable further processing of the rules such as removing formatting, parsing the content of the rules, and representing them semantically.
  • Allow an inference engine to be applied over the formalised rule base.
  • Make legislation more transparent and communicable among interested parties such as government departments, EU governments, and citizenry.
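
The point about applying an inference engine over a formalised rule base can be sketched in a few lines of Python. This is a minimal forward-chaining loop over propositional if-then rules, assumed here purely for illustration; the rules and facts are invented and do not reproduce the content of any act or any existing rule engine.

```python
def forward_chain(facts, rules):
    """Apply rules repeatedly until no new facts can be derived.

    facts: set of atomic facts (strings)
    rules: list of (premises, conclusion) pairs, where premises is a set of facts
    """
    derived = set(facts)
    changed = True
    while changed:
        changed = False
        for premises, conclusion in rules:
            # Fire a rule when all its premises hold and it adds something new.
            if premises <= derived and conclusion not in derived:
                derived.add(conclusion)
                changed = True
    return derived


# Hypothetical rule base; the predicates are invented for this example.
rules = [
    ({"born_in_uk", "parent_is_citizen"}, "citizen"),
    ({"citizen"}, "may_hold_passport"),
]

ground_facts = {"born_in_uk", "parent_is_citizen"}
print(sorted(forward_chain(ground_facts, rules)))
```

Given the two ground facts, the loop first derives citizenship and then the right to hold a passport. Real legislative rules would also need variables, exceptions, and defeasibility, none of which this sketch handles.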

Scope

To attain the objectives, we propose the following phases, where the numbers represent weeks of effort:

  • Create a relatively small sample corpus to scope the study.
  • Manually identify the forms of legislative rules within the corpus.
  • Develop or adapt an annotation scheme for rules.
  • Apply the analysis tools of GATE and annotate the rules.
  • Validate that GATE annotates the rules as intended.
  • Apply the annotation system to a larger corpus of documents.

For each phase, we would produce a summary of results, noting where difficulties are encountered and ways they might be addressed.

Extending the work

The work can be extended in a variety of ways:

  • Apply the GATE rules to a larger corpus with more variety of rule forms.
  • Process the rules for semantic representation and inference.
  • Take into consideration defeasibility and exceptions.
  • Develop semantic web applications for the rules.

London’s DataStore Workshop

Saturday, October 24th, 2009

Today I attended a workshop organised by the Greater London Authority (GLA), which is the citywide government for London. The workshop was held at City Hall on the top floor where we had a splendid view over the Thames, of Tower Bridge, and the Tower of London.

The GLA is in the process of scoping a datastore for information about London. The objective is to begin to encourage development of “government 2.0” using open government data along the lines of what has been done in San Francisco in the US (see an article on DataSF and a post by San Francisco Mayor Gavin Newsom). The principal idea is that by putting data of public interest into the public domain, the government can provide the basis for development of applications and services for the government, business community, and public. For example, using police data, one can generate crime maps.

At the GLA meeting, the objective was to meet with the developer community to get ideas and feedback on what and how the data should be released as well as how best to encourage applications in the near future.

Clearly, the GLA meeting is along the lines of what is happening elsewhere in the UK government (see Digital Engagement at the Cabinet Office, the Office of Public Sector Information, and The Stationery Office).

There were some 70 participants at the meeting, and we can look forward to further information coming from the organisers at the GLA. Some very useful suggestions were made about where to get further information, such as the Technology Strategy Board, which supports technology development in the UK.

Among the topics of discussion were:

  • What sort of data should be released and in what form? There were those who wanted it raw and those who wanted it structured. It will likely be released in both forms.
  • How to get licensing for the data? There are a host of difficult issues here, as most of the data is owned or copyrighted by a range of organisations, each of which wants to control the flow of information, profit from it, or has concerns about security/liability. Moreover, the information service providers contracted by the government to process the data may have some legal claim. Such providers may be required to make their data open.
  • How would the data be used? There were many suggestions about data reuse and mash up, mostly along the lines of existing applications such as mapping data to physical maps in order to get ideas about what is happening where in neighborhoods, transportation assistance, information access in a local area, and so on.
  • Who would develop the applications and how would development be funded? Clearly there is an issue about funding, but some of the ways around it are to leverage funding between academic, government, and business communities.
  • Who to consult about applications? A range of parties might be consulted about what they would find useful, from the person on the street, to members of service organisations (police, licensing, etc), to higher level government organisations. Alternatively, the GLA or similar organisations might develop applications which they thought would be useful, then provide them to the public. Here, the focus would be on small, manageable, pilot projects to show proof of concept.

The discussion was very nicely organised and led — several large tables around a circle, several large monitors, several boards for writing, an MC who kept things moving along, and a good overall atmosphere. However, missing were a list of participants, contact information, and a short (three sentence) statement of interest; hopefully, all this will appear soon.

This is all very interesting and exciting, for things are just beginning to happen. However, I have some concerns and realise I have a somewhat different focus.

  • There were too few participants from government services and academia. This is also reflected in the gap between the technology and the data, since it is highly unclear who is developing what, for whom, and for what purpose. It is hard to get a handle on how data should be served without some sense of goals. Nonetheless, there will likely be another meet-up at which this more substantive discussion will happen.
  • There was only passing mention of ontologies, annotation, information extraction, and the semantic web. The absence of semantic web concepts suggests that “reasoning” and complex information management are not high on the agenda. This is consistent with the family of application ideas (graphs, maps, local information). While I was told that there were ontologies via OPSI, this is not what I understood from John Sheridan in my recent discussion, so I will be eager to see exactly what this is.
  • Similarly, there was some discussion about whether there should be or could be standards and schemas for the data. These are always compatible (make them both available), but I see standards as essential for any communication across agencies and localities. There are drawbacks to standards development, but the issue will arise sooner or later in any case.
  • There is, as yet, no pitch. In other words, what is the incentive for anyone to make their data available or for organisations to otherwise cooperate with this endeavour? The only incentive mentioned was government obligation, but this is perhaps the most heavy handed way to make headway. Rather, there should be positive incentives. In addition, the eGovernment agenda should be pushed (e.g. transparency, support for government, participation, efficiency, cost reduction, consistency, and so on).
  • There was little discussion of exactly which technologies would be used, though RDF/XML and REST were mentioned. These are generic and widespread; is the real hangup right now data access, or is there some technological issue? If I wanted to program and provide a simple service, what would I have to know and do?
  • Despite the widespread interest in government 2.0, there is little vertical/horizontal integration or communication among the interested parties. There is not, apparently, a coherent website or ‘state of the art’ article with links to the relevant data/functionalities/support organisations.
  • There was no over-arching conception of design or context for applications. Likely some sort of ‘apps’ or plugins framework will emerge so that, for example, a local council would build none of its own applications or services, but these would be provided as plugins by independent providers, yet given a consistent style and structure.
  • Though there are claims that there have been consultations about government 2.0 with the various interested parties, there is no clear presentation of the results of those consultations. A ‘brain-storming’ site would be very useful.
  • It is unclear to me to what extent the participants have the political/social context in mind. While we were hosted by the GLA and discussed GLA data, the opportunities, limitations, requirements, and objectives of government seem to have been entirely overlooked. For example, government is successful (not always, but often) with making and monitoring standards for the public good; as elsewhere, why not here? The requirements of government information provision are different from those for commercial provision, especially since the government provides goods and services that would not otherwise be profitable. The government does consult with the public and interested parties in making policy, but in some cases it is crucial that government lead and direct developments; the government is not simply another commercial provider of goods and services, driven by consumer interests; the government has a legislative role. Keeping this in mind may change the sorts of proposals that come out of open GLA data.
  • There were several discussions about why and how government data should be published. The main points ought to be developed, discussed further, and summarised. Yet, it ought also be pointed out that there is, in the UK, an abundance of information that the government holds about individuals; it is unclear how a ‘firewall’ to protect and promote civil liberties will be set up and maintained; privacy and rights are in fact rather weak in the UK. For example, the NHS is state funded and one might argue certain matters are in the public interest, so open information issues may arise here: will we have ‘disease’ mashups such as there are for broken lamp-posts, but in this case for drug addicts, HIV carriers, swine flu, etc?
  • I am particularly interested in legal reasoning, but this is not something on the agenda with respect to this data.

In any case, there is much of interest here and much to look forward to.

Cheers,
Adam Wyner

Copyright © 2009 Adam Wyner

Podcast with John Sheridan

Friday, August 14th, 2009

A podcast from Nodalities with John Sheridan about e-Government, open data, and linked government data.

Meeting with John Sheridan on the Semantic Web and Public Administration

Tuesday, August 11th, 2009

I met today with John Sheridan, Head of e-Services, Office of Public Sector Information, The National Archives, located at the Ministry of Justice, London, UK. Also at the meeting was John’s colleague Clare Allison. John and I had met at the ICAIL conference in Barcelona, where we briefly discussed our interests in applications of Semantic Web technologies to legal informatics in the public sector. Recently, John got back in contact to talk further about how we might develop projects in this area.

Perhaps most striking to me is that John made it clear that the government (at least his sector) is proactive, looking for research and development projects that make government data available and usable in a variety of ways. In addition, he wanted to develop a range of collaborations to better understand the opportunities the Semantic Web may offer.

As part of catching up with what is going on, I took a look around the web for relatively recent documents on related activities.

In our discussion, John gave me an overview of the current state of affairs in public access to legislation, in particular, the legislative markup and API. The markup is intended to support publication, revision, and maintenance of legislation, among other possibilities. We also had some discussion about developing an ontology of government which would be linked to legislation.

Another interesting dimension is that John’s office is one of a few that I know of which are actively engaged in developing a knowledge economy partly encouraged by public administrative requirements and goals. Others in this area are the Dutch and the US (with xml.gov). All very promising, and the discussions are well worth following up on.

Participating in One-Lex — Managing Legal Resources on the Semantic Web

Wednesday, July 22nd, 2009

Later this summer, I’ll be participating in the summer school Managing Legal Resources in the Semantic Web, September 7 to 12 in San Domenico di Fiesole (Florence, Italy). This program will focus on several aspects of legal document management:

  • Drafting methods, to improve the language and the structure of legislative texts
  • Legal XML standards, to improve the accessibility and interoperability of legal resources
  • Legal ontologies, to capture legal metadata and legal semantics
  • Formal representation of legal contents, to support legal reasoning and argumentation
  • Workflow models, to cope with the lifecycle of legal documentation

While I’m familiar with several of these areas, I’m using this opportunity to fill in my knowledge in these key areas.

NSF sponsored workshop: Automated Content Analysis and the Law

Wednesday, July 22nd, 2009

I was invited to participate in an NSF-sponsored workshop, Automated Content Analysis and Law, August 3 and 4 at NSF HQ in Arlington, VA and organised by Georg Vanberg (UNC).

There are two sessions planned. The first session will focus on identifying the theoretical/substantive puzzles in legal and judicial scholarship that might benefit from automated content analysis as well as what data and measurements are required. For the second session, the focus is on the state of automated content analysis/natural language processing, exploring the extent to which current technology is relevant to providing results with respect to issues raised in the first session and what might be needed.

There is an interesting mix of people, with a strong emphasis on legal scholarship bearing on the US Supreme Court and opinion mining. I had an email exchange with Georg, the workshop organiser, about this, and we agree that attention ought to turn from the Supreme Court to lower levels of the legal system. I also suggested that participants consider some of the following points which bear on the motives and objectives of these lines of research in terms of who is being served and how the data or conclusions would be used.

Questions for Discussion

  • What sorts of artifacts and technologies (if any) will emerge from the research?
  • How does the research relate to the Semantic Web?
  • What public service does the research provide or support?
  • How does this research relate to:
    • E-discovery
    • Textual legal case based reasoning
    • Legislative XML Markup
    • Other research communities e.g. ICAIL and JURIX

Participants

  • Scott Barclay (NSF) – Barclay@uamail.albany.edu
  • Cliff Carrubba (Emory) – ccarrub@emory.edu
  • Skyler Cranmer (UNC) – skylerc@email.unc.edu
  • Barry Friedman (NYU) – friedmab@juris.law.nyu.edu
  • Susan Haire (NSF) – shaire@nsf.gov
  • Lillian Lee (Cornell) – llee@cs.cornell.edu
  • Jimmy Lin (Maryland) – jimmylin@umd.edu
  • Stefanie Lindquist (Texas) – SLindquist@law.utexas.edu
  • Will Lowe (Nottingham) – will.lowe@nottingham.ac.uk
  • Andrew Martin (Wash U) – admartin@wustl.edu
  • Wendy Martinek (NSF) – wemartin@nsf.gov
  • Kevin McGuire (UNC) – kmcguire@unc.edu
  • Wayne McIntosh (Maryland) – wmcintosh@gvpt.umd.edu
  • Burt Monroe (Penn State) – blm24@psu.edu
  • Kevin Quinn (Harvard) – kevin_quinn@harvard.edu
  • Jonathan Slapin (Trinity College) – jonslapin@gmail.com
  • Jeff Staton (Emory) – jkstato@emory.edu
  • Georg Vanberg (UNC) – gvanberg@unc.edu
  • Adam Wyner (University College London) – adam@wyner.info

Website on ICT and e-Government

Saturday, April 18th, 2009

The following links to an interesting website on ICT and e-Government.

publictechnology.net

This site contains a link to the winners of awards in e-Government:

2007 e-Government Awards Winners

At some later point, I will post some notes about the topics and the winners, highlighting what is not in this list.