January 27, 2015
On Tuesday, 27 January 2015 at 15:09:36 UTC, Laeeth Isharc wrote:
>> I cannot speak about small team experiences. Our projects usually take around 30+ developers.
>
> That is a decent-sized team to have to coordinate, and it puts emphasis on very different questions.  The context I am thinking of is much leaner - more like special forces than the regular army (I mean in terms of flexibility and need to respond quickly to changing circumstances) - although the sums at stake are likely comparable to larger teams (the area is hedge fund portfolio management).
>
>> In terms of server applications, yes when the applications are deployed usually the memory usage might not be optimal.
>
> For you that is less important, and I suppose that comes from the intrinsic nature of the situation.  You have beefy machines serving many users, I suppose?  I am thinking of a purpose where there are only a handful of users, but the data sets may be larger than we are used to working with, requiring more work than just plain map reduce, and where rapid iteration and prototyping is important.  Also to minimize cognitive overload and complexity.
>
> A friend has written an article on big data in finance for Alpha magazine, and I will post a link here once it has been published.
>  One problem in economics is you have to forecast the present, because the numbers are published with a lag and themselves reflect decisions taken months previously.  But markets are forward looking and discount a future that may be imagined but cannot be understood based only on hard facts.
>
> So we need all the help we can get, particularly during an epoch where things change all the time (the EUR/CHF FX rate moved forty percent in a day...).  Bridgewater have taken the work of Hal Varian and applied it to use media and web analytics to get a good live cut of economic activity and inflation.  Although it is not a tough problem theoretically, people don't actually do as much as they could yet - I think finance is behind tech companies, but they are catching up.  Another fund that my friend writes about uses employee sentiment to pick stocks to be long and short of - they manage a few billion and have done quite well.
>
>> However that is why profiling and language knowledge is required.
>
> Yes, I can imagine, and it sounds like not just that Java is the best option for you, but perhaps the only viable one.  I am curious though - what do you think the memory footprint is as a ratio to C++ before and after fine tuning?  And what proportion of development time does this take?


Actually we use more than just Java.

JVM languages, .NET languages, C++ (only in the realm of JNI/PInvoke/COM), JavaScript.

Memory optimizations only take place if really requested by the customer,
from their acceptance tests, which is seldom the case.

Usually one to two sprints might be spent.


>
>> Fine-tuning a Java application is no different from tuning other compiled languages; it just requires knowing which knobs to turn.
>
> I liked a quote by a certain C++ guru talking about the experience of a Facebook, to the effect that a sensible first draft written in C++ would perform decently, whereas this was not always true of other languages.  Now their trade-off between programmer productivity and execution efficiency is extreme, but is this merely an edge case of little relevance for the rest of us, or is it a Gibsonian case of the future being already here, and just not evenly distributed?  I am no expert, but I wonder if the latter may be more true than generally appreciated.
>
>> For example, foreach allocates and a simple for does not, so choose wisely how to iterate.
>
> Would love to hear any other insights you have on this topic.  There ought to be a FAQ on getting along with the GC for fun and profit.

JavaOne, Skills Matter and Microsoft BUILD have performance talks
every now and then.

Then there is the mechanical sympathy blog and mailing list.

http://mechanical-sympathy.blogspot.de/

>
>> The thing that Java still loses in memory management, even in commercial JVMs, is the lack of value types, since escape analysis algorithms are not very aggressive, but support is being designed and will land partially in the Java 9 and Java 10 time frame.
>
> That was my real point - and that it does really matter in some areas, and that these are growing very quickly.  (I appreciate that someone reading my post quickly would see the 15% power thing and focus on that, which was not so much my point - although I am still suspicious of the idea that Java will always keep up with native code without doing a lot of extra work).
>
> People on the Slashdot thread were saying what is the point of D.
>  But the way I saw it, where is the real competition for my use case?  I can't be extravagant with memory, but I still need rapid development and productivity.  We all have a tendency to think that what we know from experience and reading is the full picture, but the world is a big place, and something needs to appeal to someone to grow, not necessarily to oneself personally.
>
> The C++ integration is the remaining piece.  Otherwise it is like the old Soviet Union - this is the factory that makes the steel that builds the factory that makes the steel that... so that Vladimir may have a new car.  I.e., one spends too much time in roundabout investment before one actually reaps the benefit of higher productivity.
>
>
>> So by Java 10, according to the planned features, there will be value types, and even the open source JVM will have some form of JIT cache. Add to that the large number of available libraries.
>>
>> As for D, it surely has its place and I am looking forward to language adoption.
>
>
> Out of interest, what do you see as characterising this place (abstracting away from things that are not perfect now, but will probably be fixed in time)?  And in an enterprise environment, what would you use D for today?
>
>
> Laeeth.


To be honest, I don't see a place in the enterprise for my type of work.

As a language geek, I hang around in multiple language forums and I like D, because I got to appreciate systems programming with memory-safe programming languages back in the 90s.

Our projects are usually based on distributed computing using JVM/.NET stacks, using mainly Oracle and MS SQL Server, web services, Hadoop, Akka(.NET).

Devops like to be able to use the respective management consoles on the servers across the network.

Desktop applications tend to be built on top of Eclipse RCP/Netbeans, WPF or plain Web.

The mobile space is covered with Web applications, Cordova or Xamarin.

This is my little world, but I imagine D being usable in startups or companies not constrained by other language tech stacks.


--
Paulo
January 27, 2015
> Out of curiosity, what is lacking in the current commercial offerings for hedge fund management? Why not use an existing engine?

In the general sense, lots is lacking across the board.  I started a macro fund in 2012 with a former colleague from Citadel in partnership with another company, with the idea that they would provide infrastructure as they had experience in this domain.  I should not say more, but let's say that I was not so happy with my choice of corporate partner.  This experience made me think more carefully about the extent to which one needs to understand and control technology in my business.

One of the things that was striking was the very limited set of choices available for a portfolio management system.  Macro involves trading potentially any liquid product in any developed (and sometimes less developed) market, so it doesn't fit well with product offerings that have a product silo mentality.  One uses a portfolio management system very intensively, so user interface matters.  But very few of the offerings available seemed even to be passable.  We ended up going with these guys who have a decent system because it was spun out of a hedge fund but if you asked me about passable alternatives, I do not know if there are any.  http://www.tfgsystems.com/

There are of course specific challenges for macro and for startup funds that may not be generally true of the domain - it is a big area and what people need may be different.  Larger funds use a combination of third party technologies and their own bits, but I am not sure that everyone is perfectly happy with what they have.  I formerly jointly ran fixed income in London for Citadel, a big US fund, so have some background in the area.  Things changed a lot since then, and I certainly wouldn't want to speak about Citadel.

It's a funny domain, because the numbers are more like a large business, but there are not all that many people involved.  People on the investment side don't necessarily have a technology background, or have the time and attention to spare to hone their specification of exactly how they want things to work.  So one can have a strange experience of on paper being in a situation where one ought to have one's pick of systems, but in practice feeling starved of resources and control.  This is one of the reasons I decided to spend time refreshing my technology skills, even though by conventional wisdom the basic tenets of opportunity cost and division of labour would suggest there is no point.  Things have changed a lot in the past twenty years, and the only way to keep up is to get one's hands dirty now and then.

Again on the resources front - given what happened in 2008, there has been an understandable focus on reporting, compliance, and the like.  It's a surprisingly brittle business because your costs are fixed, whereas revenues depend on performance and assets.  Investment strategies intrinsically tend to experience an ebb and flow, whilst it is human nature to extrapolate performance, and investors, being human, tend to chase returns.  So it's not today necessarily the fashion to have a large group of people to develop ideas and tools that might pay off, but where it is hard to demonstrate that they will beforehand.  There has been a cultural change in the industry accompanying its institutionalisation, so it's today much more 'corporate' in mindset than it once was, and this shift has not only positive aspects.

In many cases, you can kind of do what you want in theory using Bloomberg.  The problem is that it is closed, and with a restrictive API, so if you want to refine your analysis, that becomes limiting.  But because you can do a lot that way (and it is presented very attractively) it's not so easy to justify rebuilding some functionality from scratch in order to have control.

To take an almost trivial example, Bloomberg offers the ability to receive an alert by email when markets hit various price conditions (or certain very basic technical analysis indicators are triggered).  That's valuable, but not enough for various reasons: one needs to maintain the alerts by hand (last I checked); I don't trust email for delivery of something important; and I want to be able to consider more complex conditions.  One could do this in a spreadsheet, but that's not in my opinion the way to run a business.  Python is fine for this kind of thing, but I would rather engineer the whole thing in a better way, since the analytics are shared between functions.
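
As a rough illustration of what I mean by more complex conditions, here is a minimal Python sketch. All the names (Alert, above, below, all_of, triggered), the instrument symbols and the price levels are made up for the example; it says nothing about how Bloomberg's alerts work, and delivery of the alert is left out entirely.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# A price snapshot: instrument symbol -> latest price (symbols are hypothetical).
Prices = Dict[str, float]
Condition = Callable[[Prices], bool]


@dataclass
class Alert:
    name: str
    condition: Condition  # returns True when the alert should fire


def above(symbol: str, level: float) -> Condition:
    """Fires when the price of `symbol` is strictly above `level`."""
    return lambda prices: prices[symbol] > level


def below(symbol: str, level: float) -> Condition:
    """Fires when the price of `symbol` is strictly below `level`."""
    return lambda prices: prices[symbol] < level


def all_of(*conditions: Condition) -> Condition:
    """Combines conditions: fires only when every one of them fires."""
    return lambda prices: all(c(prices) for c in conditions)


def triggered(alerts: List[Alert], prices: Prices) -> List[str]:
    """Returns the names of the alerts whose condition holds for this snapshot."""
    return [a.name for a in alerts if a.condition(prices)]


# Alerts are defined in code, so they do not have to be maintained by hand.
alerts = [
    Alert("EURCHF below parity", below("EURCHF", 1.00)),
    Alert(
        "EURCHF below parity and Bund yield under 0.5",
        all_of(below("EURCHF", 1.00), below("BUND10Y", 0.50)),
    ),
]

# Made-up numbers, purely for illustration.
snapshot = {"EURCHF": 0.98, "BUND10Y": 0.75}
print(triggered(alerts, snapshot))
```

With the made-up snapshot above only the first alert fires, because the composite one also requires the second instrument to be under its level. Because conditions are plain functions, the same building blocks could be reused by other analytics rather than living in a spreadsheet.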

Or to take another example, charting and data management platforms for institutional traders remain unsatisfactory.  It's not easy to pull data into Bloomberg, and to do so in an automated way where your data series are organized.  One wants to have all the data in one place and be able to run analyses on them, and I am not aware of a satisfactory platform available for this.  Quite honestly, the retail solutions are much more impressive - it's just that they don't cover what one needs as a professional.  By building it oneself, one has control and can work towards excellence.  The combination of incremental improvements, small in themselves, is underestimated in our world today as a contribution to success.

> Also, why D? Why not use a language or platform designed for scalability and distributed computing like http://chapel.cray.com/ ?

Pragmatically, I am an old C programmer, and there is a limit to how much I can learn in the time available.  It seems to me I can do everything I need in D in a way that is scalable for practical purposes.  Some of what I want to do is totally straightforward scripting, and some is more ambitious.  It is nice to be able to use a single language, if it's the right tool for the job (and if not, then interoperability matters).  If sociomantic (and that advertising company linked to in the blog post from a while back about using D for big data) can do what they do, I can't imagine it will be limiting for me for a while.  I will check it out, but there is a beauty to starting with the smallest useful version, and knowing that you can scale if you need to.

I recognize this reply is meandering a bit - since the major topic is use of D for big data in finance, whereas I am touching on a whole host of applications where I see it being rather useful.



Laeeth.
January 27, 2015
On Tuesday, 27 January 2015 at 19:27:43 UTC, Laeeth Isharc wrote:
> One of the things that was striking was the very limited set of choices available for a portfolio management system.  Macro involves trading potentially any liquid product in any developed (and sometimes less developed) market, so it doesn't fit well with product offerings that have a product silo mentality.  One uses a portfolio management system very intensively, so user interface matters.

I have to admit that I know very little about hedge funds, so this is all quite new and intriguing for me (and therefore piques my interest! ;^). I am using Google Cloud for creating App Engine web apps, but I have wanted to experiment with Google's cloud computing offerings for a while. Do you think that Compute Engine and BigQuery would be suitable for your needs? Or is it required that you have all your data on site locally? Google has pretty good stability (SLA), but I guess they were very slow for a few hours during the Olympics or so a couple of years ago (some load balancing mechanism that went bananas).

> There are of course specific challenges for macro and for startup funds that may not be generally true of the domain - it is a big area and what people need may be different.  Larger funds use a combination of third party technologies and their own bits, but I am not sure that everyone is perfectly happy with what they have.

So, basically there might be a market for tailoring solutions so that clients can gain strategic benefits?

> that they will beforehand.  There has been a cultural change in the industry accompanying its institutionalisation, so it's today much more 'corporate' in mindset than it once was, and this shift has not only positive aspects.

Ah, I sense you are going against the stream by getting your hands dirty in a DIY way. Good! :)

> becomes limiting.  But because you can do a lot that way (and it is presented very attractively) it's not so easy to justify rebuilding some functionality from scratch in order to have control.

So Bloomberg have basically commoditized the existing practice, in a way, thus reinforcing a particular mindset of how things ought to be done, perhaps? And maybe you see some opportunities in doing things differently? :-)

> are triggered).  That's valuable, but not enough for various reasons: one needs to maintain the alerts by hand (last I checked);

For legal reasons?

> in my opinion the way to run a business.  Python is fine for this kind of thing, but I would rather engineer the whole thing in a better way, since the analytics are shared between functions.

Not sure what you mean by the "functions", do you mean technical computations or people (like different functional roles)?

> automated way where your data series are organized.  One wants to have all the data in one place and be able to run analyses on them, and I am not aware of a satisfactory platform available for this.

How large are the datasets?

> needs as a professional.  By building it oneself, one has control and can work towards excellence.  The combination of incremental improvements, small in themselves, is underestimated in our world today as a contribution to success.

Yes, and you can also tailor the interface to the user, so professionals can eventually get more done or be less frustrated by getting rid of the clutter. In some cases I try to make the interface so simple that no learning (and therefore no confusion) is necessary, which is kind of important for functions that are seldom used. But it sounds like you are creating tools for yourself, so that might not apply in your case?

> Pragmatically, I am an old C programmer, and there is a limit to how much I can learn in the time available.  It seems to me

Sounds like D might be a good starting point for you, an incremental upgrade from C.

> I can do everything I need in D in a way that is scalable for practical purposes.  Some of what I want to do is totally straightforward scripting, and some is more ambitious.  It is nice to be able to use a single language, if it's the right tool for the job (and if not, then interoperability matters).  If sociomantic (and that advertising company linked to in the blog post from a while back about using D for big data) can do what they do, I can't imagine it will be limiting for me for a while.  I will check it out, but there is a beauty to starting with the smallest useful version, and knowing that you can scale if you need to.

If you need very high performance on a single CPU then you probably need a compiler that will generate good SIMD code for you, but I suppose you could try out a tool like Intel's experimental vectorizing compiler https://ispc.github.io/ or something else that can vectorize and link it in if D is too slow for you.

> I recognize this reply is meandering a bit - since the major topic is use of D for big data in finance, whereas I am touching on a whole host of applications where I see it being rather useful.

You might find D a bit lacking on the SIMD side.  With AVX you can basically boost performance by a factor of 5-6x compared to non-vectorized code, but maybe D will benefit from the auto-vectorizing support that is being added to LLVM for Clang.

How do you plan to do the user interface? HTML5?