# Understanding our data

Let's better understand what data we have and how it is stored, fetched, and surfaced. We divide the data we hold or consume into three categories, based on its type, source, and behavior in the system.

Note: Data from a single source may have parts that belong to more than one of the categories defined here.

  • Just-in-Time Data
  • Event Real-time Data
  • Graph State Data

# Just-in-Time Data

This category refers to data fetched from third-party APIs in response to user-generated queries (on the TJ client or extension).

  • There is no good way to keep an index of the data we fetch.
  • The schema of this data is only loosely maintained in our system, mostly just to map it to a UI widget.
  • We do not persist the content or details of this data unless the user adds it to the Graph State or takes some action based on it.
  • Custom rules for Value Exchange are possible for this data; however, because there is no persistent schema, thorough testing is recommended.
  • Well-defined verbs might not be available for when this data is surfaced or consumed.
  • The UI is partially powered by this data.

Example: Wolfram Alpha Answers
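
To make the properties above concrete, here is a minimal sketch of how a just-in-time result might be handled. Everything named here (`JitResult`, `WIDGET_BY_SOURCE`, `fetch_jit`, `fake_wolfram`, the widget names) is hypothetical and only illustrates the idea: the payload is fetched on demand, kept as an unvalidated dictionary, mapped to a UI widget, and not persisted.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict


@dataclass
class JitResult:
    """Loosely typed result of a third-party query (hypothetical shape)."""
    source: str                      # e.g. "wolfram_alpha"
    widget: str                      # UI widget used to render the payload
    payload: Dict[str, Any] = field(default_factory=dict)  # unvalidated content


# Hypothetical mapping from source to UI widget. In this sketch it is the
# only "schema" kept for just-in-time data.
WIDGET_BY_SOURCE = {"wolfram_alpha": "answer_card"}


def fetch_jit(source: str, query: str,
              fetcher: Callable[[str], Dict[str, Any]]) -> JitResult:
    """Fetch on demand and wrap the raw payload; nothing is persisted here."""
    widget = WIDGET_BY_SOURCE.get(source, "generic_card")
    return JitResult(source=source, widget=widget, payload=fetcher(query))


def fake_wolfram(query: str) -> Dict[str, Any]:
    """Stand-in for a real third-party API call (assumption, not a real client)."""
    return {"query": query, "answer": "placeholder answer"}


result = fetch_jit("wolfram_alpha", "distance to the moon", fake_wolfram)
```

Because the payload has no enforced schema, any rule or widget that reads from `result.payload` should tolerate missing keys, which is one reason thorough testing is recommended for this category.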

# Event Real-time Data

This category refers to the event stream generated by all actions performed by a user or bot in the TJ ecosystem.

  • This is a time-series data set.
  • The event stream is piped through zelda for type validation and sanitization.
  • This data is indexed by actor and verb.
  • This data is stored in our preferred document DB (Mongo).
  • We have only direct READ privileges on this data.
  • Events can only be CREATED by the system, in response to an action, a cron job, or another event.
  • Custom rules for Value Exchange should preferably be based on events.
  • The schema of these events is well maintained and is under constant maintenance by our team.
  • This data never powers the UI directly.

Examples:

  • Page visited by a user
  • Quest started by a user
  • Daily reporting pipeline run completed
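
The event properties listed above can be sketched as an append-only log indexed by actor and verb. This is an in-memory illustration under stated assumptions, not the actual store: `Event`, `EventLog`, `system_append`, and the field names are hypothetical, and the small check inside `system_append` merely stands in for the validation step that the text attributes to zelda.

```python
from collections import defaultdict
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Any, Dict, List, Tuple


@dataclass(frozen=True)
class Event:
    """One immutable entry in the time series (hypothetical fields)."""
    actor: str
    verb: str
    timestamp: datetime
    data: Dict[str, Any] = field(default_factory=dict)


class EventLog:
    """Append-only, in-memory stand-in for the event store."""

    def __init__(self) -> None:
        self._events: List[Event] = []
        # Index of positions in self._events, keyed by (actor, verb).
        self._index: Dict[Tuple[str, str], List[int]] = defaultdict(list)

    def system_append(self, event: Event) -> None:
        """Intended for system use only; no update or delete is offered."""
        if not event.actor or not event.verb:
            raise ValueError("actor and verb are required")
        self._index[(event.actor, event.verb)].append(len(self._events))
        self._events.append(event)

    def read(self, actor: str, verb: str) -> List[Event]:
        """The only operation exposed to consumers: an indexed read."""
        return [self._events[i] for i in self._index.get((actor, verb), [])]


log = EventLog()
log.system_append(Event("user:42", "page_visited",
                        datetime.now(timezone.utc),
                        {"url": "https://example.com"}))
log.system_append(Event("user:42", "quest_started",
                        datetime.now(timezone.utc)))
visits = log.read("user:42", "page_visited")
```

Since events are immutable and carry a stable schema, a Value Exchange rule can be expressed as a function over the result of `read`, which is part of why events are the preferred basis for such rules.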

# Graph State Data

This category refers to the final (current) state of the TJ ecosystem, also known as Factual Internal Information.

  • This is stored as RDF triples / graph data, partly in Neo4j and partly in Mongo.
  • Full CRUD privileges, subject to authorization and access control, are surfaced to users.
  • Custom rules for Value Exchange cannot be built on this data; however, rules can always refer to this state.
  • The schema of the graph is well maintained and is under constant maintenance by our team.
  • The UI is mostly powered by this data.

Examples:

  • Notes made by a user
  • My Friends
  • Bookmarks
  • ThoughtMaps / thought transcripts
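
As a rough illustration of the points above, the sketch below models graph state as a set of (subject, predicate, object) triples with user-facing CRUD guarded by an authorization check. It uses plain Python tuples rather than Neo4j or an RDF library, and the names (`GraphState`, `_authorize`, the predicates) and the rule "users may only change triples about themselves" are assumptions made for the example.

```python
from typing import List, Optional, Set, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)


class GraphState:
    """In-memory stand-in for the graph store, holding the current state only."""

    def __init__(self) -> None:
        self._triples: Set[Triple] = set()

    @staticmethod
    def _authorize(user: str, triple: Triple) -> None:
        # Hypothetical rule: users may only change triples about themselves.
        if triple[0] != user:
            raise PermissionError(f"{user} may not modify {triple[0]}")

    def create(self, user: str, triple: Triple) -> None:
        self._authorize(user, triple)
        self._triples.add(triple)

    def read(self, subject: Optional[str] = None,
             predicate: Optional[str] = None) -> List[Triple]:
        # None acts as a wildcard for either position.
        return sorted(
            t for t in self._triples
            if (subject is None or t[0] == subject)
            and (predicate is None or t[1] == predicate)
        )

    def update(self, user: str, old: Triple, new: Triple) -> None:
        self._authorize(user, old)
        self._authorize(user, new)
        self._triples.remove(old)   # raises KeyError if the triple is absent
        self._triples.add(new)

    def delete(self, user: str, triple: Triple) -> None:
        self._authorize(user, triple)
        self._triples.discard(triple)


graph = GraphState()
graph.create("user:42", ("user:42", "friend_of", "user:7"))
graph.create("user:42", ("user:42", "bookmarked", "https://example.com"))
graph.update("user:42",
             ("user:42", "bookmarked", "https://example.com"),
             ("user:42", "bookmarked", "https://example.org"))
bookmarks = graph.read(subject="user:42", predicate="bookmarked")
```

Unlike the event log, this structure keeps no history: an update replaces the old triple, so only the current state is visible to the UI.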