Technology Manifesto (Or, The World Can't Wait For This)
Elegance, Simplicity: publishing and learning should be a pleasure rather than a chore,
and they should not be expensive. While searching and browsing the web is fun, when we
are focused on a learning task we want as much relevant information as possible at our disposal.
Searching and browsing take effort and distract us. Why can't we just "Learn it!"?
Dynamic Content: content constantly changes, so any relationship between
two pieces of content can't be assumed to be static (e.g., a URL) but can be
mediated by intelligent, dynamic "links."
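
One way to picture a dynamic "link" is as a stored selection rule that is re-evaluated against the current content every time it is followed. The sketch below is only an illustration under that assumption; the names `Document`, `DynamicLink`, and the in-memory `catalog` are hypothetical and not part of any existing system.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class Document:
    doc_id: str
    topic: str
    version: int


@dataclass(frozen=True)
class DynamicLink:
    """A link stored as a selection rule rather than a fixed address."""
    description: str
    matches: Callable[[Document], bool]

    def resolve(self, catalog: list) -> Optional[Document]:
        # Re-evaluate the rule against the catalog as it exists right now,
        # preferring the highest version among the matching documents.
        candidates = [doc for doc in catalog if self.matches(doc)]
        return max(candidates, key=lambda doc: doc.version, default=None)


catalog = [Document("a1", "latin-grammar", 1)]
link = DynamicLink("latest Latin grammar", lambda d: d.topic == "latin-grammar")
first = link.resolve(catalog)

# The content changes; the link itself does not need to be edited.
catalog.append(Document("a2", "latin-grammar", 2))
second = link.resolve(catalog)
```

The same link object yields a different target once the catalog changes, which is the property a static URL cannot offer.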
Chunking: the learning curve for contributing to and using each aspect of the
ecology (both intrinsic components for developers
and user interfaces for end-users) is measured in minutes. During design, constantly break large
components down into the smallest usable components possible.
Distributed Federation of Service-based Components: eventually hundreds of thousands of service-oriented
components linked by messaging.
This disassociates the performance of a component from its runtime environment: server, multicore CPU,
storage, memory, and network and IO speed.
The proportion of processes running these components, though tiny when compared to data, will
grow. Avoid plugin schemes and monolithic
tools and applications. We are building an ecology, not a "system." Within a couple of decades, libraries
(e.g., an XML parser) will tend to be service-based components rather than linked libraries.
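
As a rough illustration of components "linked by messaging," the sketch below uses a toy in-process bus in place of a real network transport. The `MessageBus` class, the topic names, and the two components are all hypothetical; the point is only that each component knows topic names, never the other component.

```python
from collections import defaultdict, deque
from typing import Callable

Handler = Callable[[dict], None]


class MessageBus:
    """Toy in-process stand-in for the messaging layer between components."""

    def __init__(self) -> None:
        self._subscribers = defaultdict(list)  # topic -> list of handlers
        self._queue = deque()                  # pending (topic, message) pairs

    def subscribe(self, topic: str, handler: Handler) -> None:
        self._subscribers[topic].append(handler)

    def publish(self, topic: str, message: dict) -> None:
        # Publishing only enqueues; the sender never calls a receiver directly.
        self._queue.append((topic, message))

    def run(self) -> None:
        # Deliver until no messages remain, including ones published by handlers.
        while self._queue:
            topic, message = self._queue.popleft()
            for handler in self._subscribers[topic]:
                handler(message)


bus = MessageBus()
results = []


def tokenizer(message: dict) -> None:
    # Component 1: splits text and publishes the tokens on another topic.
    bus.publish("tokens", {"tokens": message["text"].split()})


def counter(message: dict) -> None:
    # Component 2: consumes tokens without knowing who produced them.
    results.append(len(message["tokens"]))


bus.subscribe("text", tokenizer)
bus.subscribe("tokens", counter)
bus.publish("text", {"text": "arma virumque cano"})
bus.run()
```

Because the only coupling is the message shape, either component could in principle be moved to another process or machine by swapping the bus for a networked transport.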
Adaptive Interfaces and Data Stores: Proxy Components: schemas and protocols are always adaptive and
easily transformable. If a transformation is necessary,
a separate component can provide it, without requiring changes in producer and consumer components with
Also, the store is private to a component and does not impose external structure. Suitable technologies are
NoSQL stores (e.g., HBase, MongoDB, Cassandra)
and in some cases SQL databases, but they should be thought of as local file systems.
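
The idea of a separate transformation component can be sketched as follows. The record shapes, field names, and function names here are invented for illustration; the only claim is that the proxy sits between an unmodified producer and an unmodified consumer.

```python
from typing import Callable

Record = dict


def producer() -> Record:
    # Hypothetical producer schema: a single "name" field, "Family, Given".
    return {"name": "Maro, Publius Vergilius", "lang": "la"}


def consumer(record: Record) -> str:
    # Hypothetical consumer schema: separate "given" and "family" fields.
    return f'{record["given"]} {record["family"]} [{record["language"]}]'


def split_name_proxy(record: Record) -> Record:
    """Standalone transformation component sitting between the two."""
    family, given = (part.strip() for part in record["name"].split(",", 1))
    return {"given": given, "family": family, "language": record["lang"]}


def connect(source: Callable[[], Record],
            proxy: Callable[[Record], Record],
            sink: Callable[[Record], str]) -> str:
    # Neither the producer nor the consumer is modified; only the wiring changes.
    return sink(proxy(source()))


output = connect(producer, split_name_proxy, consumer)
```

If either schema later changes, only the proxy needs to be replaced or versioned.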
Intelligent Use of Open Source and Standards: Standards should never get in the way of progress.
Supporting certain standards can require much more work than is
necessary to implement a component. We can think of component design and implementation as scoped by its
Adaptive interfaces and stores allow us to change and grow protocols as the diversity of the component's
consumers grows. A solution is to
build proxies to standards: adaptive components in their own right that act as façades for non-compliant components.
This makes the
implementation of standards a separate concern from developing the infrastructure that is the point of the
whole thing. In other
words, make standardization a specialization that can be delegated to experts.
Quick Design and Implementation: Ideas are expressed as small, tested, efficient, bullet-proof,
well-documented, versioned, open source components immediately available to the public.
Identity: Identity is polymorphic and probabilistic. It's polymorphic because for a naively conceived
identity (e.g., myself as a person) there are in fact multiple "identities" (e.g., myself as a husband, as a
It's probabilistic because we cannot do better than to correlate an aggregate of attributes with any one
identity. For example, consider
OCLC's FRBR services. This approach to
identifying (classifying) books and authors, though an improvement over WorldCat,
does not, in its present implementation, take into account the fact that automated
classification is error-prone.
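
To make the probabilistic view concrete, the sketch below scores how well a record's attributes agree with each candidate identity and returns a ranked list of scores instead of a single yes/no answer. The attribute names, weights, and identifiers are assumptions chosen for illustration; this is not how OCLC's services work.

```python
def match_score(record: dict, candidate: dict, weights: dict) -> float:
    """Fraction of total attribute weight on which the two records agree."""
    total = sum(weights.values())
    agreed = sum(
        weight
        for attribute, weight in weights.items()
        if record.get(attribute) is not None
        and record.get(attribute) == candidate.get(attribute)
    )
    return agreed / total


def rank_identities(record: dict, candidates: list, weights: dict) -> list:
    # Keep every candidate with its score; the caller decides what is "enough".
    scored = [(match_score(record, c, weights), c["id"]) for c in candidates]
    return sorted(scored, reverse=True)


# Hypothetical weights: how much each attribute counts toward a match.
weights = {"surname": 5, "birth_year": 3, "known_work": 2}

record = {"surname": "Vergilius", "birth_year": -70, "known_work": "Aeneid"}
candidates = [
    {"id": "person:2", "surname": "Vergilius", "birth_year": -19, "known_work": "Eclogues"},
    {"id": "person:1", "surname": "Vergilius", "birth_year": -70, "known_work": "Georgics"},
]

ranking = rank_identities(record, candidates, weights)
```

Exposing the score, rather than hiding it behind a boolean, lets downstream components account for the fact that automated classification can be wrong.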
Permissive Licensing: All components are to be licensed to permit commercial use but require
modifications to be contributed back. It is important
to encourage commercialization to spread benefits and engage the best and most diverse minds that we can.
Implementation-agnostic: components can be implemented in any programming language and run on the operating system, hardware,
and geographical location of choice.
Rich Analytics Framework: Federated analytics are typed (which invites evaluation and experimentation),
versioned, and can run anywhere as a service.
Component Repositories: it is easy to get, update, manage, and trust a range of contributed
components and UIs from distributed repositories.
High Performance: user interface services respond quickly to requests.
Scalable: a federation of components allows us to select the best versions that meet a specific need
and budget. Horizontal scalability is guaranteed by a federation
of components of the same type.
Available: mechanisms for automatic monitoring of, and failover to, redundant (backup) services.
Professionals and users can't be at the mercy
of glitches in a data center.
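
A minimal sketch of failover to redundant services, assuming each service is just a callable that either returns a response or raises an exception. The function and service names are hypothetical, and real monitoring (health checks, timeouts, alerting) is left out.

```python
from typing import Callable, Sequence


class AllServicesFailed(RuntimeError):
    """Raised when every redundant service has failed for a request."""


def call_with_failover(services: Sequence[Callable[[str], str]], request: str) -> str:
    """Try each redundant service in order and return the first success."""
    errors = []
    for service in services:
        try:
            return service(request)
        except Exception as exc:  # a real monitor would be more selective
            errors.append(exc)
    raise AllServicesFailed(f"{len(errors)} service(s) failed for {request!r}")


def primary(request: str) -> str:
    # Simulates a glitch in one data center.
    raise ConnectionError("primary data center unreachable")


def backup(request: str) -> str:
    return f"backup handled {request}"


response = call_with_failover([primary, backup], "lookup:arma")
```

The caller sees a normal response even though the primary failed; it only sees an error when every redundant service is down.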
Evolutionary Schemas and Semantics: schemas are dynamically created, updated, and embedded (think Just-In-Time
Semantics). An automated method proposes
and promotes semantics (e.g., relationship names and meanings) using a manually adjustable
reputation-based approach. An example
of this approach is the Avro serialization system, whereby the schema is transmitted with the data. There
are of course significant challenges in synchronizing the schemas, but the gains in productivity, security, and
are enormous, comparable to the acceleration of RPC applications (web) when we moved from CORBA to SOAP and
then to the REST architecture.
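
The idea of transmitting the schema with the data can be sketched with a self-describing JSON envelope. This is a loose illustration only: it does not use the Avro library or Avro's actual encoding, and the envelope layout, the `PYTHON_TYPES` table, and the `Lemma` schemas are assumptions made for the example.

```python
import json

# Assumed mapping from schema type names to Python types (illustrative only).
PYTHON_TYPES = {"string": str, "int": int}


def encode(schema: dict, record: dict) -> str:
    """Bundle the schema with the record so the message describes itself."""
    return json.dumps({"schema": schema, "data": record})


def decode(message: str) -> dict:
    """Read a message using only the schema that travelled with it."""
    envelope = json.loads(message)
    schema, data = envelope["schema"], envelope["data"]
    record = {}
    for field in schema["fields"]:
        name, type_name = field["name"], field["type"]
        if name in data:
            value = data[name]
        elif "default" in field:
            value = field["default"]  # schema evolution: fill in new fields
        else:
            raise ValueError(f"missing field {name!r}")
        if not isinstance(value, PYTHON_TYPES[type_name]):
            raise TypeError(f"field {name!r} is not of type {type_name}")
        record[name] = value
    return record


schema_v1 = {"name": "Lemma", "fields": [{"name": "form", "type": "string"}]}
schema_v2 = {"name": "Lemma", "fields": [
    {"name": "form", "type": "string"},
    {"name": "frequency", "type": "int", "default": 0},
]}

old = decode(encode(schema_v1, {"form": "cano"}))
new = decode(encode(schema_v2, {"form": "cano"}))
```

The consumer needs no prior knowledge of either schema version, which is what allows schemas to evolve without coordinated releases.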
Reuse: Each type of service component is small enough that it can be used as a working example of how
to build similar components.
Cloud Support: the "cruft" consists of ready-made scripts for deploying and using components on cloud
providers (e.g., Amazon Web Services).
My focus is on what I know best and what interests me most: architecting intelligent authoring technologies for intermedia
(intertextual + media), with emphasis on the archaic and classical Latin, Greek, Hittite, and Sanskrit corpora.