
Thread: <xml><thread isTerrible="true" isUnderstandable="false" /></xml> - Coding help thread

  1. #2401
    erichkknaar
    Join Date
    April 10, 2011
    Posts
    8,945
    Quote Originally Posted by Frug View Post
    Quote Originally Posted by elmicker View Post
    Quote Originally Posted by erichkknaar View Post
    I assume your comment about Apache is really pointing at the elephant in the room with any JVM-based big data architecture, which is the unmitigated shit that is zookeeper
    While I doubt that's what he was on about, what's your problem with ZK?
    Yeah that's absolutely not what I was on about. I'm not ops. At best I dip into a dev ops role when needed. Zookeeper's never been a problem for me because I don't manage it, I just know it's a thing that's there juggling servers or something.

    As a dev I'm a consumer of the services once they're up. So, for example, apache servers (ew) and solr, which is basically a shitty elasticsearch with shitty enterprisey configuration that nobody wants to use. We do use it because it works, and if massive xml files for configuration are your thing, you won't get my distaste. An ELK stack, on the other hand, is actually fun. It may be that Solr wins some performance contest or has all the MEPS, but I doubt it would matter to us. Similarly from my perspective Rabbit's advantages outweigh SNS as I see the latter introducing complexity I have to manage and an SDK that's worse than the AMQP library I have used.

    I figured the reason Solr is what it is, is a combination of semi-neglected foss management and a b2b audience that accepts what's handed to them. So I figured kafka is the same. This is speculation on my part, I fully expect someone to tell me I'm wrong for xyz reasons, or lazy for wanting things to be easy. Also the latest Solr is better, but still. Bleah.
    Kafka is a bit more unique than solr vs elasticsearch.

    As a type of server, it's more of a properties-file, unix-server style of java server layout, and it can easily be integrated into linux services, docker containers, or whatever.

    In practice, you deploy a cluster, probably with at least three nodes for a production deployment (sometimes a few more). It needs at least two processes, kafka and zookeeper, even on a single-node deployment. It's a binary protocol for producers and consumers, plus some rules about how to partition the messages you want to enqueue across the cluster. Its main, fairly novel feature is that all the messages to a topic across a cluster are indexed and searchable in parallel, and can either be persistent or windowed by size or age. This allows many interesting ways to process data at scale: the consumers keep a bookmark, so they can go back and rewind, pick back up where they crashed, and get all kinds of other useful semantics. It's mostly about scale.
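
    To make the "bookmark" idea concrete, here is a toy, in-memory sketch. It is not the Kafka client API and not how the broker is implemented. `Topic`, `Consumer`, `poll` and `seek` are made-up names for illustration, and hashing the key with CRC32 is just an assumed partitioning rule. It only shows why an append-only log per partition plus a per-consumer offset gives you rewind and resume for free.

```python
from collections import defaultdict
from typing import Dict, List, Tuple
from zlib import crc32

Record = Tuple[str, str]  # (key, value)


class Topic:
    """Toy stand-in for a partitioned, append-only topic (hypothetical)."""

    def __init__(self, num_partitions: int) -> None:
        self.partitions: List[List[Record]] = [[] for _ in range(num_partitions)]

    def partition_for(self, key: str) -> int:
        # Assumed rule: a stable hash, so one key always maps to one partition.
        return crc32(key.encode("utf-8")) % len(self.partitions)

    def append(self, key: str, value: str) -> Tuple[int, int]:
        """Append a record and return (partition, offset) where it landed."""
        partition = self.partition_for(key)
        self.partitions[partition].append((key, value))
        return partition, len(self.partitions[partition]) - 1


class Consumer:
    """Keeps one bookmark (next offset to read) per partition."""

    def __init__(self, topic: Topic) -> None:
        self.topic = topic
        self.offsets: Dict[int, int] = defaultdict(int)

    def poll(self, partition: int, max_records: int = 10) -> List[Record]:
        # Read from the bookmark onward, then advance the bookmark.
        log = self.topic.partitions[partition]
        start = self.offsets[partition]
        batch = log[start:start + max_records]
        self.offsets[partition] = start + len(batch)
        return batch

    def seek(self, partition: int, offset: int) -> None:
        # Rewinding is just moving the bookmark; the log itself is untouched.
        self.offsets[partition] = offset
```

    Because reading never removes anything from the log, a consumer that crashed can be rebuilt with its last saved offsets and carry on, and a second consumer can replay the same partition from offset 0 without affecting the first.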

    The last point is why you put up with the complexity. Kafka came out of LinkedIn. It is fairly unique (or will be until soon, cause everyone wants a "not java" kafka), but as I said, I'm prejudiced against zookeeper, so I've been figuring out a new approach to the same problem. Different compromises for different purposes. Stream processing is really where all of this comes into its own, so streamdb ksql and things like leveldb implementations built on it are really excellent, highly scalable solutions to what is inevitably a data ingestion problem, and can often then offer various 'real-time' (heh) insights into data.

    Quote Originally Posted by Frug View Post
    it ended up consuming ops cycles in a way that completely outstripped its position as a dependency to a tool we want to use.
    We had one main ops guy and one devops guy able to keep it up with no problems while also supporting mesos, consul, our servers, and whatever else. I was there for a couple of years with this setup.
    Things like transaction volume have a way of making some software very hard to use in some cases where it would be very easy in others. A single mysql server directly connected to a page with no caching or anything is the obvious canonical example. We had also been using it, in some sense, since 2007 or so. Once our Kafka cluster had been in production for a bit, and the entire business grew, which increases clients, etc., at some point we hit a point where zookeeper began to get flaky. I mean, the team responsible handles a lot of other stuff too. My point was that zookeeper began to consume a disproportionate amount of time, from a causing problems and "Blinking lights" point of view.

    E: It's also worth pointing out that most of my corporate hadoop experience was with a certain vendor who don't really use the Apache implementation, so different issues but no zookeeper.
    meh

  2. #2402
    elmicker
    Join Date
    April 14, 2011
    Posts
    5,505
    Nah, the problem there is that Solr is ooollld. The first release of Lucene was in 1999, and it is something of a Doug Cutting Special in API design terms (see also: Avro, MapReduce). Likewise Solr is from the deep dark ages of the early 2000s; while Yonik is a fucking genius, there's no getting away from the XML.

    That said, it is a similar story. The only functional area where ES will beat Solr is rapidly evolving data, which almost no one uses it for. Otherwise Solr is just a better product. There are some ease of use areas where ES is nicer*, but that stops mattering when you're seriously looking at needing dozens of heterogeneous machines in a tiered config to meet your throughput needs and your consistency model basically stops existing, all because web developers don't want to work with schemas.

    *And frankly not that many, Solr 6/7 have closed almost all the gaps.

    Quote Originally Posted by erichkknaar View Post
    ...My point was that zookeeper began to consume a disproportionate amount of time, from a causing problems and "Blinking lights" point of view.
    It's worth noting that as of the newer releases Kafka has almost eliminated the ZK dependency. It's still there but the load is on the order of 1% of what it used to be. Unless you insist on using it for tracking consumer offsets like some kind of dinosaur.
    Last edited by elmicker; October 29 2017 at 07:55:51 PM.

  3. #2403
    erichkknaar
    Join Date
    April 10, 2011
    Posts
    8,945
    Quote Originally Posted by elmicker View Post
    Nah, the problem there is that Solr is ooollld. The first release of Lucene was in 1999, and it is something of a Doug Cutting Special in API design terms (see also: Avro, MapReduce). Likewise Solr is from the deep dark ages of the early 2000s; while Yonik is a fucking genius, there's no getting away from the XML.

    That said, it is a similar story. The only functional area where ES will beat Solr is rapidly evolving data, which almost no one uses it for. Otherwise Solr is just a better product. There are some ease of use areas where ES is nicer*, but that stops mattering when you're seriously looking at needing dozens of heterogeneous machines in a tiered config to meet your throughput needs and your consistency model basically stops existing, all because web developers don't want to work with schemas.

    *And frankly not that many, Solr 6/7 have closed almost all the gaps.

    Quote Originally Posted by erichkknaar View Post
    ...My point was that zookeeper began to consume a disproportionate amount of time, from a causing problems and "Blinking lights" point of view.
    It's worth noting that as of the newer releases Kafka has almost eliminated the ZK dependency. It's still there but the load is on the order of 1% of what it used to be. Unless you insist on using it for tracking consumer offsets like some kind of dinosaur.
    I know. That said, I haven't looked at it since April, so...
    meh

  4. #2404
    Frug
    Join Date
    April 10, 2011
    Location
    Canada
    Posts
    13,054
    Quote Originally Posted by elmicker View Post
    Nah the problem there is that Solr is ooollld.
    It's obvious that it's old. I knew the moment I was forced to start using it; I was like "jesus, how old is this".


    there's no getting away from the XML.
    Putting SQL and javascript transformers in an XML file is inexcusable!

    Otherwise Solr is just a better product.
    I highly suspect "better" becomes immediately subjective based on what you want. It's only just got nested documents, while you can dump what you want into ES with no problem. IMO the documentation is garbage, maybe you don't think so, but having to use it has caused us numerous pains. I have yet to see any reason that would make it "better".

    Edit: Correction: the oddity that the elk docker image requires max_map_count to be set on the host (well, increased from the default) to even run is a bit silly. I've never seen anything else require modifications to the host to run. Considering that's memory-use related, you may have a point.
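
    For anyone hitting the same thing, a rough sketch of a pre-flight check on the host. The 262144 threshold is the minimum the Elasticsearch docs give for `vm.max_map_count` and is not something stated in this post. The function names are made up for illustration, and the `/proc` path only exists on Linux hosts.

```python
from pathlib import Path

# Assumption: minimum vm.max_map_count per the Elasticsearch documentation.
REQUIRED_MAX_MAP_COUNT = 262144

# Linux exposes the current value here; other platforms will not have it.
PROC_PATH = Path("/proc/sys/vm/max_map_count")


def parse_max_map_count(text: str) -> int:
    """Parse the contents of /proc/sys/vm/max_map_count."""
    return int(text.strip())


def host_is_ready(current: int, required: int = REQUIRED_MAX_MAP_COUNT) -> bool:
    """True if the host setting is already at or above the required value."""
    return current >= required


if __name__ == "__main__":
    if PROC_PATH.exists():
        current = parse_max_map_count(PROC_PATH.read_text())
        if host_is_ready(current):
            print(f"vm.max_map_count={current}, fine")
        else:
            print(f"vm.max_map_count={current}, raise it to {REQUIRED_MAX_MAP_COUNT} on the host")
    else:
        print("no /proc/sys/vm/max_map_count here, probably not a Linux host")
```

    The check has to run on the host (or with the host's /proc visible), since this is a kernel setting that the container shares with the host rather than one it owns.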
    Last edited by Frug; October 30 2017 at 12:09:35 AM.

    Quote Originally Posted by Loire
    I'm too stupid to say anything that deserves being in your magnificent signature.

  5. #2405
    helgur
    Join Date
    April 24, 2011
    Location
    Putting owls in your Moss
    Posts
    8,369
    https://slproweb.com/products/Win32OpenSSL.html ragepost in red is an amusing read

  6. #2406
    Daneel Trevize
    Join Date
    April 10, 2011
    Location
    T L A
    Posts
    12,018
    Shining Light Productions is a sole proprietorship
    Really? You'd never know from the Product Support section on that linked page, explaining a 0-tolerance policy for typos "since you ARE e-mailing a real developer"...
    Quote Originally Posted by QuackBot View Post
    Idk about that, and i'm fucking stupid.

  7. #2407
    helgur
    Join Date
    April 24, 2011
    Location
    Putting owls in your Moss
    Posts
    8,369
    Not really an uncommon occurrence to have this sort of primadonna tendency among developers though.

    Should have called it «Bacon of Shining light» for added comedy
