2011. 8. 30. 11:08 Market Data

Time series database

http://en.wikipedia.org/wiki/Time_series_database

A time series database server (TSDS) is a software system optimized for handling time series. In this context, a time series is an associative array of numbers indexed by a datetime or a datetime range. These time series are often called profiles or curves, depending on the market: a time series of stock prices might be called a price curve, and a time series of energy consumption might be called a load profile. Despite the disparate naming, the operations performed on them are common enough to warrant special database treatment.
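As a minimal sketch of the "associative array indexed by a datetime" idea, the Python snippet below models a price curve as a plain dictionary and slices it by time range. The sample values and the helper name slice_range are assumptions made for illustration, not part of any particular TSDS.

```python
from datetime import datetime

# Hypothetical data: a small "price curve" keyed by timestamp.
price_curve = {
    datetime(2011, 8, 30, 9, 0): 1825.5,
    datetime(2011, 8, 30, 10, 0): 1830.0,
    datetime(2011, 8, 30, 11, 0): 1828.25,
}

def slice_range(series, start, end):
    """Return the points with start <= timestamp < end, in time order."""
    return {t: v for t, v in sorted(series.items()) if start <= t < end}

# Select the observations from 09:00 up to, but not including, 11:00.
morning = slice_range(
    price_curve,
    datetime(2011, 8, 30, 9, 0),
    datetime(2011, 8, 30, 11, 0),
)
```

A real server would store and index such data far more compactly; the dictionary only shows the logical shape.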

TSDSs simplify the development of software with complex business rules in a wide variety of sectors. Queries for historical data, replete with time ranges, roll-ups and arbitrary time zone conversions, are difficult to express in a relational database, and compositions of those rules are even more difficult. The problem is compounded by the free-form nature of relational systems themselves: many relational schemas do not model time series data correctly. A TSDS, on the other hand, imposes a model, and that constraint is what allows it to provide more features for working with time series.

Ideally, these repositories are implemented natively using specialized database algorithms. However, good performance has also been obtained by storing time series as binary large objects (BLOBs) in a relational database, or by using a very large database (VLDB) approach coupled with a pure star schema. These approaches work best when time is treated as a fact, not a dimension.

Overview

The TSDS allows users to create, enumerate, update and destroy various time series and organize them in some fashion. These series may be organized hierarchically and optionally have companion metadata available with them. The server often supports a number of basic calculations that work on a series as a whole, such as multiplying, adding, or otherwise combining various time series into a new time series. They can also filter on arbitrary patterns defined by the day of the week, low value filters, high value filters, or even have the values of one series filter another. Some TSDSs also build in a wealth of statistical functions.

For example, consider the following hypothetical "time series" or "profile" expression:

SELECT nymex/gold_price * nymex/gold_volume

To analyze this, the TSDS would join the two series nymex/gold_price and nymex/gold_volume based on the overlapping areas of time for each, multiply the values where they intersect, and then output a single composite time series.
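The join-and-multiply step can be sketched in Python as follows. This is only an illustration under a simplifying assumption: both series are keyed by exact timestamps, so "overlap" means timestamps present in both. The function name multiply_series and the sample numbers are hypothetical.

```python
from datetime import datetime

def multiply_series(a, b):
    """Multiply two series point by point where their timestamps coincide."""
    common = sorted(a.keys() & b.keys())  # timestamps present in both series
    return {t: a[t] * b[t] for t in common}

# Hypothetical stand-ins for nymex/gold_price and nymex/gold_volume.
gold_price = {
    datetime(2011, 8, 30, 9): 1800.0,
    datetime(2011, 8, 30, 10): 1810.0,
    datetime(2011, 8, 30, 11): 1805.0,
}
gold_volume = {
    datetime(2011, 8, 30, 10): 20,
    datetime(2011, 8, 30, 11): 10,
    datetime(2011, 8, 30, 12): 5,
}

# Only 10:00 and 11:00 appear in both inputs, so only they survive the join.
notional = multiply_series(gold_price, gold_volume)
```

Series indexed by datetime ranges would need interval intersection instead of key matching, which this sketch does not attempt.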

Obviously, more complex expressions are allowed. TSDSs often allow users to manage a repository of filters or masks that specify in some way a pattern based on the day of a week and a set of holidays. In this way, one can readily assemble time series data. Assuming such a filter exists, one might hypothetically write

SELECT onpeak( cellphoneusage )

which would extract only the portion of the cellphoneusage series that intersects 'onpeak'. Some systems might generalize the filter to be a time series itself.
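As a rough illustration of such a filter, the Python sketch below keeps only the points that fall inside a peak window. The window itself (non-holiday weekdays, 08:00 to 20:00), the holiday set and the sample readings are all assumptions chosen for the example; a real system would read them from its filter repository.

```python
from datetime import date, datetime, time

# Hypothetical holiday calendar (2011-09-05 was US Labor Day).
HOLIDAYS = {date(2011, 9, 5)}

def is_onpeak(ts):
    """Assumed definition: non-holiday weekdays, 08:00 <= time < 20:00."""
    return (
        ts.weekday() < 5
        and ts.date() not in HOLIDAYS
        and time(8) <= ts.time() < time(20)
    )

def onpeak(series):
    """Keep only the points of a series that fall in the peak window."""
    return {t: v for t, v in series.items() if is_onpeak(t)}

cellphoneusage = {
    datetime(2011, 8, 30, 7, 0): 3,    # Tuesday, before the peak window
    datetime(2011, 8, 30, 12, 0): 11,  # Tuesday, inside the window
    datetime(2011, 9, 3, 12, 0): 8,    # Saturday
    datetime(2011, 9, 5, 12, 0): 9,    # Monday, but a holiday
}
peak_usage = onpeak(cellphoneusage)
```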

This syntactical simplicity drives the appeal of the TSDS. For example, a simple utility bill might be implemented using a query such as:

SELECT MAX( onpeak( powerusagekw ) ) * demand_charge;
 
SELECT SUM( onpeak( powerusagekwh ) ) * energy_charge;
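The two billing queries can be mimicked in Python as below. This is a sketch under stated assumptions: readings are hourly (so the kWh used in an hour equals the average kW numerically), the peak window is weekdays 08:00 to 20:00 with holidays ignored, and the charge rates are invented.

```python
from datetime import datetime, time

def onpeak(series):
    """Assumed peak window: weekdays, 08:00 <= time < 20:00 (holidays ignored)."""
    return {
        t: v for t, v in series.items()
        if t.weekday() < 5 and time(8) <= t.time() < time(20)
    }

# Hypothetical hourly meter readings for Tuesday 2011-08-30.
powerusagekw = {
    datetime(2011, 8, 30, hour): kw
    for hour, kw in [(6, 40.0), (9, 75.0), (14, 90.0), (22, 95.0)]
}
# With hourly intervals, energy per interval (kWh) equals average power (kW).
powerusagekwh = dict(powerusagekw)

demand_charge = 12.0  # hypothetical price per kW of peak demand
energy_charge = 0.25  # hypothetical price per kWh

# MAX( onpeak( powerusagekw ) ) * demand_charge
demand_cost = max(onpeak(powerusagekw).values()) * demand_charge
# SUM( onpeak( powerusagekwh ) ) * energy_charge
energy_cost = sum(onpeak(powerusagekwh).values()) * energy_charge
```

Note that the 22:00 reading is the largest overall but falls outside the assumed peak window, so it does not set the demand charge.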

TSDSs also generally implement conversions to and from specific time zones at the server level.

Example

A workable implementation of a time series database can be easily deployed in a conventional SQL-based relational database provided that the database software supports both binary large objects (BLOBs) and user-defined functions. SQL statements that operate on one or more time series quantities on the same row of a table or join can easily be written, as the user-defined time series functions operate comfortably inside of a SELECT statement. However, time series functionality such as a SUM function operating in the context of a GROUP BY clause cannot be easily achieved.
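The BLOB-plus-user-defined-function approach can be sketched with Python's built-in sqlite3 module. The table layout, the packing format (little-endian doubles, values only, series assumed already aligned) and the function names ts_max and ts_mul are assumptions for illustration, not a reference design.

```python
import sqlite3
import struct

def pack_series(values):
    """Serialize a list of floats into a BLOB of little-endian doubles."""
    return struct.pack(f"<{len(values)}d", *values)

def unpack_series(blob):
    """Inverse of pack_series: each double occupies 8 bytes."""
    return list(struct.unpack(f"<{len(blob) // 8}d", blob))

def ts_max(blob):
    """Scalar UDF: maximum value stored in one series BLOB."""
    return max(unpack_series(blob))

def ts_mul(a, b):
    """Scalar UDF: element-wise product of two aligned series BLOBs."""
    product = [x * y for x, y in zip(unpack_series(a), unpack_series(b))]
    return pack_series(product)

conn = sqlite3.connect(":memory:")
conn.create_function("ts_max", 1, ts_max)
conn.create_function("ts_mul", 2, ts_mul)
conn.execute("CREATE TABLE series (name TEXT PRIMARY KEY, price BLOB, volume BLOB)")
conn.execute(
    "INSERT INTO series VALUES (?, ?, ?)",
    ("gold", pack_series([1800.0, 1810.0]), pack_series([20.0, 10.0])),
)

# Both series sit on the same row, so the UDFs compose inside a plain SELECT.
(peak_notional,) = conn.execute(
    "SELECT ts_max(ts_mul(price, volume)) FROM series WHERE name = 'gold'"
).fetchone()
```

These are scalar functions evaluated row by row. Summing series across the rows of a GROUP BY would require a custom aggregate, which is the harder case the paragraph above mentions.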

Posted by karlsen
http://www.xenomorph.com/downloads/whitepapers/ql-plus/

TimeScape QL+

Data. Decisions. Together.

This white paper describes TimeScape QL+, a new query language designed for financial markets in order to bring users, data and decision-making closer together. This language is easy to understand, easy to extend, provides powerful support for vector arithmetic of intraday and historic data and deals with data issues that are specific to the financial markets.

PDF

View complete TimeScape QL+ white paper PDF.
(This document requires the free Adobe/Acrobat Reader.)
16 pages, 420KB

Introduction

More Data, Less Time

Why do we need another query language? Good question. The usage of SQL is both prevalent and effective throughout financial markets IT and the software industry as a whole. However, there are many people who need to access, manipulate and analyse data who are not technical experts in SQL but are experts in their own particular field of business.

Nowhere is this more apparent than within the financial markets - where traders, fund managers, quants, research analysts, risk managers and other business staff are under constant competitive pressure to analyse larger and larger volumes of increasingly complex data in less and less time.

Users and Technologists Think About Data Differently

Business users in financial markets tend to think about the data they need in terms of the financial instruments they trade - such as equities, bonds, options etc. Traders do not think of real-time data, time series data, tick data, static data and calculated data very differently - to them all of this data is needed and relevant to the business decisions that are being made.

However, this same data is generally split out across many tables when stored in a typical relational database implementation, rendering the data as less than user-friendly to access. The implementation gets more complex if different asset classes are stored in different databases. Even within one asset class, static terms and conditions data may be stored in different systems from time series data, tick data and calculated/derived data.

In the absence of further end-user tools to analyse the data, the trader or risk manager must understand both the SQL programming language and the table structure/architecture implemented for the databases in question. Even if the business user is capable of achieving these two things, they often do not have the time to do anything more than avoid using the database by downloading equivalent data from Open Bloomberg into Microsoft Excel.

Technologists can, of course, go some way to addressing the above issue through the provision of read-only views and wrapper functions (in the form of stored procedures) to hide some of the complexity from the end-user. However, as data demands from the business expand, and table complexity increases, these procedures can become computationally expensive and increasingly difficult to manage, maintain and change. As a result, business users continue to be heavily reliant on technologists to deliver and so the unproductive cycle continues.

White paper contents

  • More Data, Less Time
  • Users and Technologists Think About Data Differently
  • TimeScape QL+: Designed for Financial Markets
  • Functional Overview of TimeScape QL+
  • Some Examples of TimeScape QL+ in Action
    • Example #1 – Loading a Price History for an Equity
    • Example #2 – Calculating Historic Volatility of an Underlying
    • Example #3 – The ‘.?’ Statement and Context Sensitive Help
    • Example #4 – Data Rules and Bond Spread Analysis
    • Example #5 – Tick Data Analysis, Data Frequency and VWAP
    • Example #6 – Multiple Instrument Analysis
    • Example #7 – Vector Arithmetic, Spread Analysis and VAR
    • Example #8 – Adding Your Own Functions
    • Example #9 – Adding Your Own Objects
  • Future Directions
  • Summary
differently: in a different way, not alike, separately, apart

http://www.xenomorph.com/downloads/whitepapers/high-frequency-data/

High Frequency Data Analysis

Considered decision-making with TimeScape

This paper illustrates how Xenomorph’s real-time analytics and data management system, TimeScape, enables extremely rapid and extensible analysis of tick and intraday timeseries data delivering competitive advantage in pre- and post-trade decision support.

PDF

View complete High Frequency Data Analysis white paper PDF.
(This document requires the free Adobe/Acrobat Reader.)
20 pages, 860KB

Introduction

A Time of Change and Opportunity

Data management in financial markets is being driven through a period of fundamental change. Trade volumes are increasing exponentially as electronic execution delivers faster trading with ever-tighter margins. The proprietary algorithms used in algorithmic and statistical arbitrage trading are becoming more complex. Developments in areas such as credit theory are establishing market relationships that motivate more complex cross-asset trading strategies. Regulations such as MiFID and Regulation NMS are pushing the whole industry towards better and more transparent execution, but are also fundamental drivers behind both huge business change and dramatically increased data volumes. All of these factors are combining to provide both profit and cost incentives to move away from single asset class data silos.

From a trader’s perspective, a decade ago many practitioners were content with analysing end-of-day historic data for strategy back-testing and instrument pricing purposes. They quite possibly had no choice from a technological perspective; the capture, storage and analysis of intraday data volumes was challenging even then, especially at a time when the relational database was still a relatively new technology. Given, however, that derivative pricing margins were wider and statistical arbitrage was profitable using end-of-day prices, there was also little incentive to store and analyse intraday tick and high frequency data. Much tighter trading margins, cross-asset trading and improved technology have changed traders’ perceptions of what is required and what kind of analysis is possible with high frequency intraday data.

Risk management has previously not been greatly concerned with intraday data. As many risk managers will confirm, obtaining clean data for end-of-day risk measurement is challenging enough. Risk measurement techniques such as Monte Carlo or historical simulation VaR require large amounts of historical data and are calculation intensive. Large data universes or poor implementation may make it challenging to run these techniques as an overnight batch, let alone perform the calculations in real or near real-time. Increased intraday trading exposure, better understanding of intraday market behaviour and recent regulatory requirements concerning data transparency and data quality are driving risk managers towards more analysis of tick-by-tick and intraday data.

The changes above require systems that can adapt to the pace of change, delivering high performance analysis even when the quantity of intraday data being analysed is massive. This paper describes how Xenomorph’s real-time analytics and data management system, TimeScape, has been designed to meet these current challenges and to deliver competitive advantage in pre- and post-trade decision support.

White paper contents

  • A Time of Change and Opportunity
  • Data Transparency – Increased Productivity for All
  • Tick Capture – Flexibility and Ease of Use Together
  • Tick Storage – Real-Time, Intraday and History All in One
  • Data Access - for Both Traders and Technologists
    • Loading Intraday Data with Functions
    • Loading Intraday Data with TimeScape QL+
  • Data Validation – Why Waste Half Your Time?
    • Data Frequency and Time Snapping
    • Data Filling, Aligning and Loading Rules
    • Real or Interpolated Data?
  • Data Analysis – From Backtesting to Best Execution
    • Chaining Analytical Functions
    • Intraday Time Period Analysis
    • Beyond the Spreadsheet
  • Event Processing – Real-Time Analysis Automation
  • Your Proprietary Advantage
  • Summary

http://www.xenomorph.com/solutions/data-management/tick-capture/

TimeScape provides a powerful database engine (TimeScape XDB) for managing vast quantities of time series data.

This data can be anything from simple numerical data (e.g. daily closing prices or rates) and intraday (tick-by-tick) data, to more complex data that may or may not have a time dimension, such as dividend projections, instrument relationships, index/basket/portfolio compositions, curve compositions and volatility surface states.

Unlike many other time series database systems, TimeScape manages these types of data within one consistent and highly efficient database system that is scalable and easily customised to meet the ever increasing demands of the business.

In addition, it makes this data easily accessible to end-users, developers and systems alike via its powerful business orientated analysis language called TimeScape QL+.

This simple to use language has been specifically designed to bridge the gap between business users and technologists, without compromising performance. It allows highly sophisticated analysis to be constructed and utilised from the user’s environment of choice, whether that is Microsoft Excel, a TimeScape application or one that an organisation has built themselves using the TimeScape development toolkits.

In particular, TimeScape’s Tick / Time Series database:

  • Comes with a user-orientated set of desktop tools for interacting, visualising and analysing the data stored within it
  • Supports a wide variety of data types (including numbers, date-times, logical, text, lists, matrices, references, formulae, spreadsheet inside, binary objects)
  • Supports static, semi-static (weekly, monthly), daily and tick data frequencies
  • Has a very flexible data model that co-supports multiple instrument identifier systems (Reuters, Bloomberg, ISIN, Internal security systems etc.) as well as side-by-side support for multiple sources of data (Reuters, Bloomberg etc.)
  • Includes a sophisticated security model that allows data access to be controlled down to a particular instrument attribute level (for example, limiting write access to US Equity Closing prices)
  • Is supported either in a proprietary format or as part of SQL Server 2005/8
  • Is highly scalable, with benchmarks having been completed recently on a 12 TB database containing over 750 billion ticks of data with no noticeable loss in performance

Email us to request more information about how TimeScape Tick/Time Series Database technology can help your organisation.

allows: to permit, to give, to acknowledge, to authorize, to let (someone do something)

http://www-01.ibm.com/software/data/informix/blades/timeseries/

The IBM® Informix® TimeSeries DataBlade™ module greatly expands the functionality of your database by adding sophisticated support for managing time-series and temporal data.

A "time series" is any set of data that is accessed in sequence by time and can be processed and analyzed in a chronological order. Key features of the Informix TimeSeries DataBlade include:

2009/4/20
A Few Well-Chosen Words about Programming Languages from a Long-Time Designer


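Kx CEO and Developer of Kx Technology
January 4, 2004
Born in 1958

At Morgan Stanley there was no good (fast, ...) workstation version of APL, so I created A+ as a replacement for APL. I used it to analyze historical data. I used it to implement a trading system that bought and sold $100 million of stock every day. After that it was used everywhere. There was no other software apart from the operating system. In 1993 I left Morgan Stanley to build something about twice as productive as A+, and started Kx.

I created the k language, and it was better than A+.
I discovered a way to make everything simple.
I am always interested in making programs shorter.

I had built databases with APL, and later with A+. At the time I was curious about SQL and thought it would fit well with k. k already handled databases, but SQL performed a few very well-known operations on relational tables. So I thought it would be useful to put an SQL layer on top of k. In 1998 this was released as the kdb database together with ksql, which provided time-series extensions.

ksql made relational and time-ordered data easier to manipulate. With SQL alone, it is difficult to work with things like price deltas at various times.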
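To make "price deltas" concrete, here is a small Python sketch (not ksql) that computes the change between consecutive observations of a time-ordered series. The function name deltas and the sample prices are assumptions for illustration.

```python
from datetime import datetime

def deltas(series):
    """Change between consecutive observations, keyed by the later timestamp."""
    points = sorted(series.items())  # order by time first
    return {
        t1: v1 - v0
        for (_, v0), (t1, v1) in zip(points, points[1:])
    }

# Hypothetical prices at irregular times.
prices = {
    datetime(2011, 8, 26, 9, 30): 100.0,
    datetime(2011, 8, 26, 9, 31): 100.5,
    datetime(2011, 8, 26, 9, 33): 99.75,
}
price_deltas = deltas(prices)
```

The operation depends on row order, which is why it is awkward in plain SQL, where tables are unordered sets unless an ordering is imposed explicitly.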

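k serves the same purposes as the programming languages of other relational databases and as general-purpose programming languages, but whereas those languages are low-level and verbose like COBOL, k is high-level.

kdb+ merges k and ksql into a single language. Customers get smaller and faster programs. It provides a powerful algebra of lists and dictionaries (associative lists). Combinations of lists and dictionaries make up relational tables.

If you are a C++ programmer, what is the best way to learn a vector language like ksql? The hardest part is manipulating tables: even if a table has 100 billion rows, treat it as a single object. This is a jump in abstract thinking for many people who are used to languages that work on one item at a time. kdb+ has none of the familiar loops.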
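The "treat the whole table as one object" idea can be imitated in Python with a toy column type. This is only an analogy, not how kdb+ works internally: the Column class, the table layout and the sample trades are hypothetical.

```python
class Column:
    """A tiny column vector; the loop lives here, not in user code."""

    def __init__(self, values):
        self.values = list(values)

    def __mul__(self, other):
        # Element-wise product of two columns of equal length.
        return Column(x * y for x, y in zip(self.values, other.values))

    def sum(self):
        return sum(self.values)

# A table as a dictionary of named columns.
trades = {
    "price": Column([10.0, 10.5, 11.0]),
    "size": Column([100, 200, 100]),
}

# One expression over whole columns gives the volume-weighted average price.
vwap = (trades["price"] * trades["size"]).sum() / trades["size"].sum()
```

User code never iterates over rows; it combines columns as single values, which is the shift in thinking described above.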

kdb+ is a platform for several languages. The core language is k. ksql is a programming and database language with time series extensions. The next language is ANSI SQL. Before that, I am thinking about adding more functional types and game-playing operators. These will make programs even shorter.



2011. 8. 25. 15:42 Market Data

Universal Market Data

Number of instruments: about 30k
Record size: a few hundred bytes
Record count: several hundred million to several billion

Query: "Get ticks for Instruments {i1, ..., in} from time t1 to t2 order by time, instrument"
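The query above can be sketched against a single ticks table using Python's built-in sqlite3 module. The schema, the index, the ISO-8601 text timestamps and the sample rows are assumptions for illustration; they are not taken from the notes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Assumed layout: one table for all instruments, indexed by (instrument, ts).
conn.execute("CREATE TABLE ticks (instrument TEXT, ts TEXT, price REAL)")
conn.execute("CREATE INDEX idx_ticks ON ticks (instrument, ts)")
conn.executemany(
    "INSERT INTO ticks VALUES (?, ?, ?)",
    [
        ("IBM", "2011-08-25T09:30:00", 165.0),
        ("MSFT", "2011-08-25T09:30:00", 24.5),
        ("IBM", "2011-08-25T09:31:00", 165.2),
        ("AAPL", "2011-08-25T09:30:30", 370.0),
        ("MSFT", "2011-08-25T10:00:00", 24.7),
    ],
)

def get_ticks(conn, instruments, t1, t2):
    """Ticks for the given instruments with t1 <= ts <= t2, by time then instrument."""
    placeholders = ",".join("?" for _ in instruments)
    sql = (
        "SELECT ts, instrument, price FROM ticks "
        f"WHERE instrument IN ({placeholders}) AND ts >= ? AND ts <= ? "
        "ORDER BY ts, instrument"
    )
    return conn.execute(sql, [*instruments, t1, t2]).fetchall()

rows = get_ticks(
    conn, ["IBM", "MSFT"], "2011-08-25T09:30:00", "2011-08-25T09:45:00"
)
```

ISO-8601 strings sort in chronological order, so plain text comparison is enough for the range filter in this sketch.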

Table Design
1. OBT: more scalable, less overhead, requires indexing logic
2. OTPI

MonetDB, an open-source DB on steroids - http://monetdb.cwi.nl/ - is not as good for time series data as k, but it is much better than most of the open-source options available today.

Check out HDF5. It is open source but API only - http://hdf.ncsa.uiuc.edu/whatishdf5.html - and it works very well with Esper.

Cheaper than kdb but comes very close in speed: Time Series Database - Xenomorph Database (XDB)
