Tuesday, August 13, 2019

Master to Master Database Replication Notes, Random Stuff, and More

- I've been working on some projects which may require master to master database replication capabilities. These are my research notes (they're from a long while back, maybe even years, so please excuse me if they're not entirely up to date) and are an obvious follow-up from the following (sorry if these notes seem a little difficult to read; they're mostly for my own reference):
- a lot of modern databases offer master to master replication capabilities now
database replication master to master
- NoSQL databases are the classic archetype of the modern databases that we now get for Big Data operations. Basically, they aren't completely ACID compliant like traditional database technology. However, that doesn't really matter for a lot of modern applications, which are mostly reads with a small number of write transactions. Obviously, I'm looking to build something more traditional that scales up better?
- the obvious method is to leverage existing P2P and blockchain technology. There are a few ultrafast blockchain implementations out there but they are very rare (it's easy to see why cryptocurrencies aren't a genuine threat to the modern banking system if you understand how many current cryptocurrency options work). As with a lot of other things, design compromises often need to be made to achieve this though. You can't get the best of everything
blockchain algorithms
p2p algorithms
fastest open source blockchain
- obviously, the benchmarks you see tend to be dependent on the setup and workload, and can't really be compared fairly across the board for whatever purpose you have in mind. An example of this is NoSQL databases, which often have a small number of write nodes for a large number of read nodes
- following is a really strange rundown of database performance across the board. Notice the unexpected positions of some of them? A lot of it seems 'out of place' if you're not super familiar with this field? Note that benchmarks are often skewed towards what a particular manufacturer wants. Obviously, the most fascinating situations are where 'linear scaling' seems to be achieved. I've seen this across high end server hardware but we're seeing it increasingly at lower levels now as well
database performance comparison statistics
342 systems in ranking, April 2018
Rank      Rank      Rank      DBMS                          Database Model        Score  Change vs  Change vs
Apr 2018  Mar 2018  Apr 2017                                                   Apr 2018   Mar 2018   Apr 2017
========  ========  ========  ============================  =================  ========  =========  =========
1         1         1         Oracle                        Relational DBMS     1289.79      +0.18    -112.21
2         2         2         MySQL                         Relational DBMS     1226.40      -2.46    -138.22
3         3         3         Microsoft SQL Server          Relational DBMS     1095.51      -9.28    -109.26
4         4         4         PostgreSQL                    Relational DBMS      395.47      -3.88     +33.69
5         5         5         MongoDB                       Document store       341.41      +0.89     +15.98
6         6         6         DB2                           Relational DBMS      188.95      +2.28      +2.29
7         7         7         Microsoft Access              Relational DBMS      132.22      +0.27      +4.04
8         9         11        Elasticsearch                 Search engine        131.36      +2.81     +25.69
9         8         9         Redis                         Key-value store      130.11      -1.12     +15.75
10        10        8         Cassandra                     Wide column store    119.09      -4.40      -7.10
11        11        10        SQLite                        Relational DBMS      115.99      +1.17      +2.19
12        12        12        Teradata                      Relational DBMS       73.68      +1.21      -2.88
13        13        17        Splunk                        Search engine         65.06      -0.61      +9.55
14        15        18        MariaDB                       Relational DBMS       64.56      +1.45     +15.83
15        14        14        Solr                          Search engine         63.21      -1.60      -1.16
16        16        13        SAP Adaptive Server           Relational DBMS       61.63      -0.99      -5.83
17        17        15        HBase                         Wide column store     59.69      -1.24      +1.22
18        18        20        Hive                          Relational DBMS       57.40      +0.39     +15.75
19        19        16        FileMaker                     Relational DBMS       55.00      -0.12      -2.17
20        20        19        SAP HANA                      Relational DBMS       48.90      +0.37      +0.75
21        21        22        Amazon DynamoDB               Multi-model           43.14      +0.69     +11.08
22        22        21        Neo4j                         Graph DBMS            40.90      -0.00      +5.99
23        24        24        Memcached                     Key-value store       33.79      +2.44      +4.26
24        23        23        Couchbase                     Document store        32.34      -0.56      +1.53
25        25        25        Informix                      Relational DBMS       26.60      -0.90      -0.19
26        26        27        Microsoft Azure SQL Database  Relational DBMS       24.47      -0.15      +3.36
27        27        28        Vertica                       Relational DBMS       20.71      +0.41      +0.22
28        28        26        CouchDB                       Document store        19.85      -0.35      -2.52
29        29        30        Firebird                      Relational DBMS       18.63      +0.61      -0.45
30        31        54        Microsoft Azure Cosmos DB     Multi-model           17.19      +0.43     +12.91
31        30        29        Netezza                       Relational DBMS       16.58      -0.41      -3.02
32        32        31        Amazon Redshift               Relational DBMS       13.68      -0.51      +1.25
33        34        36        Google BigQuery               Relational DBMS       12.88      +0.39      +3.65
34        33        32        Impala                        Relational DBMS       12.68      -0.37      +1.45
35        35        39        Spark SQL                     Relational DBMS       11.14      +0.06      +3.22
36        38        40        InfluxDB                      Time Series DBMS      10.76      +0.11      +3.52
37        40        35        dBASE                         Relational DBMS       10.66      +0.15      +0.42
38        36        33        MarkLogic                     Multi-model           10.63      -0.35      -0.57
39        37        34        Greenplum                     Relational DBMS       10.39      -0.40      -0.17
40        39        -         Oracle Essbase                Relational DBMS        9.33      -1.31          -
database benchmark comparison
- as usual, I've thought about how to build one myself as well. At its core you're basically trying to synchronise files as quickly as possible at a level of granularity that is dictated by your current situation. The most obvious two node setup gets the alternate side (the side that is getting written to) to freeze/pause (most DB technology has this capability) and then sync to the newest version of a file
pause sqlite database
https://blog.devart.com/increasing-sqlite-performance.html
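A minimal sketch of the "freeze, then sync to the newest copy" idea for SQLite, using Python's built-in sqlite3 module. The function name and the demo table are made up for illustration, and this is a one-way overwrite of the stale side, not a merge of concurrent writes on both sides:

```python
import sqlite3

def sync_from_newest(newest: sqlite3.Connection, stale: sqlite3.Connection) -> None:
    """Overwrite the stale database with a consistent snapshot of the newest one."""
    # Connection.backup() uses SQLite's online backup API, so the source
    # does not have to be taken offline while the copy runs.
    newest.backup(stale)

# Hypothetical two node setup, with in-memory databases standing in for files.
node_a = sqlite3.connect(":memory:")
node_b = sqlite3.connect(":memory:")
node_a.execute("CREATE TABLE trips (id INTEGER PRIMARY KEY, card TEXT)")
node_a.execute("INSERT INTO trips (card) VALUES ('demo-card')")
node_a.commit()

sync_from_newest(node_a, node_b)
rows = node_b.execute("SELECT id, card FROM trips").fetchall()
```

Deciding which side actually is "newest" (timestamps, version counters, etc...) is the hard part and is left out here.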
- the next obvious option is to create a proxy which handles the management of database synchronisation. There are obviously existing proxies out there to use as a starting base
database proxy
nginx
perl udp and tcp proxy
Port forwarding utility in perl
perl web proxy
perl smtp mail server test
Quick perl script to test smtp auth
python web proxy
p2p in bash
web proxy using bash
perl smtp proxy
memcached on windows
- if using hashing to verify synchronisation of data then the selection of hashing algorithm is crucial, along with its likely collision rate
fastest hashing algorithms
Hash           Lowercase      Random UUID  Numbers
=============  =============  ===========  ==============
Murmur            145 ns      259 ns          92 ns
                    6 collis    5 collis       0 collis
FNV-1a            152 ns      504 ns          86 ns
                    4 collis    4 collis       0 collis
FNV-1             184 ns      730 ns          92 ns
                    1 collis    5 collis       0 collis
DJB2a             158 ns      443 ns          91 ns
                    5 collis    6 collis       0 collis
DJB2              156 ns      437 ns          93 ns
                    7 collis    6 collis       0 collis
SDBM              148 ns      484 ns          90 ns
                    4 collis    6 collis       0 collis
SuperFastHash     164 ns      344 ns         118 ns
                   85 collis    4 collis   18742 collis
CRC32             250 ns      946 ns         130 ns
                    2 collis    0 collis       0 collis
LoseLose          338 ns        -             -
               215178 collis
Name            Speed       Quality  Author
==============  ==========  =======  =================
xxHash          5.4 GB/s    10       Y.C.
MurmurHash 3a   2.7 GB/s    10       Austin Appleby
SBox            1.4 GB/s    9        Bret Mulvey
Lookup3         1.2 GB/s    9        Bob Jenkins
CityHash64      1.05 GB/s   10       Pike & Alakuijala
FNV             0.55 GB/s   5        Fowler, Noll, Vo
CRC32           0.43 GB/s   9
MD5-32          0.33 GB/s   10       Ronald L.Rivest
SHA1-32         0.28 GB/s   10
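A rough sketch of how hash based verification might drive synchronisation: split the data into chunks, hash each chunk, and only resend the chunks whose digests differ. The function names and chunk size are my own assumptions. It uses BLAKE2b from Python's hashlib because that ships with the standard library; the faster non-cryptographic hashes in the tables above (xxHash, Murmur, etc...) would need third party packages:

```python
import hashlib

def chunk_digests(data: bytes, chunk_size: int = 4096) -> list[str]:
    """Return one short digest per fixed-size chunk of data."""
    return [
        hashlib.blake2b(data[i:i + chunk_size], digest_size=16).hexdigest()
        for i in range(0, len(data), chunk_size)
    ]

def changed_chunks(local: bytes, remote_digests: list[str],
                   chunk_size: int = 4096) -> list[int]:
    """Indexes of chunks that differ from, or are missing on, either side."""
    local_digests = chunk_digests(local, chunk_size)
    total = max(len(local_digests), len(remote_digests))
    return [
        i for i in range(total)
        if i >= len(local_digests)
        or i >= len(remote_digests)
        or local_digests[i] != remote_digests[i]
    ]

# Made-up data: the second chunk was modified and a third was appended.
old_copy = b"a" * 4096 + b"b" * 4096
new_copy = b"a" * 4096 + b"c" * 4096 + b"d" * 10
to_resend = changed_chunks(new_copy, chunk_digests(old_copy))
```

The shorter the digest the cheaper it is to ship around, but the higher the chance that a changed chunk goes unnoticed because of a collision, which is the trade-off the collision counts above are getting at.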
- simulate latency issues. wondershaper will get the job done if you're living on a quiet network. Otherwise, you need to set up custom scripts via tc
simulate latency
- you may need a cloud management system of some sort. Either rely on existing cloud systems, come up with your own, hack something together yourself, etc...
mesos
- most basic would be high performance data replication/synchronisation, with configuration and optimisation of database program, configuration and optimisation of database schema, and achievement of some sort of full duplex communication mechanism. Threading type wouldn't make much of a difference here?
sqlite master master replication
1 Full duplex
2 Full-duplex emulation

    2.1 Time-division duplexing
    2.2 Frequency-division duplexing
    2.3 Echo cancellation
multithreaded code examples
multithreaded vs single threaded applications
A standardized interface for thread implementation is POSIX Threads (Pthreads), which is a set of C-function library calls. OS vendors are free to implement the interface as desired, but the application developer should be able to use the same interface across multiple platforms. Most Unix platforms including Linux support Pthreads. Microsoft Windows has its own set of thread functions in the process.h interface for multithreading, like beginthread. Java provides yet another standardized interface over the host operating system using the Java concurrency library java.util.concurrent.

Multithreading libraries provide a function call to create a new thread, which takes a function as a parameter. A concurrent thread is then created which starts running the passed function and ends when the function returns. The thread libraries also offer synchronization functions which make it possible to implement race condition-error free multithreading functions using mutexes, condition variables, critical sections, semaphores, monitors and other synchronization primitives.

Another paradigm of thread usage is that of thread pools where a set number of threads are created at startup that then wait for a task to be assigned. When a new task arrives, it wakes up, completes the task and goes back to waiting. This avoids the relatively expensive thread creation and destruction functions for every task performed and takes thread management out of the application developer's hand and leaves it to a library or the operating system that is better suited to optimize thread management. For example, frameworks like Grand Central Dispatch and Threading Building Blocks.

In programming models such as CUDA designed for data parallel computation, an array of threads run the same code in parallel using only its ID to find its data in memory. In essence, the application must be designed so that each thread performs the same operation on different segments of memory so that they can operate in parallel and use the GPU architecture.
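The thread pool paradigm described above can be sketched with Python's standard concurrent.futures module. The checksum task and the sample records are stand-ins for whatever work each replication job would really do:

```python
from concurrent.futures import ThreadPoolExecutor

def checksum(record: bytes) -> int:
    """Stand-in task; any function that takes one argument would do."""
    return sum(record) % 256

records = [b"alpha", b"beta", b"gamma", b"delta"]

# The worker threads are created once and reused for every task,
# rather than paying for thread creation and teardown per record.
with ThreadPoolExecutor(max_workers=4) as pool:
    results = list(pool.map(checksum, records))
```

pool.map() hands results back in the same order as the inputs, even if the workers finish out of order.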
- you can obviously use off the shelf message brokers if you don't want to build something from scratch
Due to its widespread integration into enterprise-level infrastructures, monitoring Kafka performance at scale has become an increasingly important issue. Monitoring end-to-end performance requires tracking metrics from brokers, consumer, and producers, in addition to monitoring ZooKeeper which is used by Kafka for coordination among consumers.[7][8] There are currently several monitoring platforms to track Kafka performance, either open-source, like LinkedIn's Burrow, or paid, like Datadog. In addition to these platforms, collecting Kafka data can also be performed using tools commonly bundled with Java, including JConsole.[9]
p2p message broker
- the alternative is to build a superfast custom microservice/message broker proxy yourself. It's a matter of maintaining both consistency and speed though
microservice
Microservices is a software development technique—a variant of the service-oriented architecture (SOA) architectural style that structures an application as a collection of loosely coupled services. In a microservices architecture, services are fine-grained and the protocols are lightweight. The benefit of decomposing an application into different smaller services is that it improves modularity and makes the application easier to understand, develop, test, and more resilient to architecture erosion.[1] It also parallelizes development by enabling small autonomous teams to develop, deploy and scale their respective services independently.[2] It also allows the architecture of an individual service to emerge through continuous refactoring.[3] Microservices-based architectures enable continuous delivery and deployment.[1][4]
memcached sql
- bitmap indexes are used to achieve higher performance than traditional relational database indexing, particularly for read-heavy queries over columns with few distinct values
https://en.wikipedia.org/wiki/Ralph_Kimball
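A toy illustration of why bitmap indexes are quick for this sort of query: each distinct value gets one bitmap (here a plain Python integer, with bit N set when row N holds that value), and combining conditions becomes a bitwise AND. The rows and column names are invented, and real implementations compress their bitmaps:

```python
from collections import defaultdict

def build_bitmap_index(rows: list[dict], column: str) -> dict:
    """Map each distinct value in column to an int whose set bits are row ids."""
    index = defaultdict(int)
    for row_id, row in enumerate(rows):
        index[row[column]] |= 1 << row_id
    return index

def matching_rows(bitmap: int) -> list[int]:
    """Expand a bitmap back into the list of row ids it represents."""
    return [i for i in range(bitmap.bit_length()) if (bitmap >> i) & 1]

# Made-up sample data.
rows = [
    {"zone": 1, "mode": "tram"},
    {"zone": 2, "mode": "train"},
    {"zone": 1, "mode": "train"},
]
by_zone = build_bitmap_index(rows, "zone")
by_mode = build_bitmap_index(rows, "mode")

# WHERE zone = 1 AND mode = 'train' is a single AND of two bitmaps.
hits = matching_rows(by_zone[1] & by_mode["train"])
```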
- text based databases are easier to deal with than binary based databases
perl database replication sourcecode master master
java database replication sourcecode master master
- this is where the problems with Myki come into perspective. It comes down to a combination of raw database performance, consistency issues, connectivity issues, etc... When it comes down to it, it's actually a really difficult task to deal with. I've come across implementations that were of a much less diminished level. An interesting exercise for anyone is to run through the problem themselves? The following puts the complexity of the problem into perspective
~500 million trips a year, which averages out to:
500000000/365=1369863.01369863 trips per day
1369863.01369863÷24=57077.625570776 trips per hour
57077.625570776/60=951.293759513 trips per minute
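The division checks out. A quick sketch that also carries it down to trips per second; note these are flat averages, and real traffic bunches up around peak periods, so the rate a system has to sustain at peak would be higher than this:

```python
TRIPS_PER_YEAR = 500_000_000  # rough figure from the notes above

per_day = TRIPS_PER_YEAR / 365
per_hour = per_day / 24
per_minute = per_hour / 60
per_second = per_minute / 60

for label, value in [("day", per_day), ("hour", per_hour),
                     ("minute", per_minute), ("second", per_second)]:
    print(f"trips per {label}: {value:,.2f}")
```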
total travellers myki network
- single node prototype up first. We'll start things at their most basic: queue control, traffic control systems, proxies, leaky bucket type stuff
- two node prototype. Parallelise then get n times speedup but you lose sync. It's similar to row based versus column based technology methodology? In line protocol optimisation, simplified schema, proxies on either side, time consistency critical which means time synching critical, data synchronisation similar to data compression protocols... Replication is basically database upon database to help achieve ACID in multi-layered context. Synchronisation is pretty easy by freezing/delaying each one. Queue with finite limit to stop DoS attacks? One DB on top of another. Timing critical
- n-node prototype. Depends on total number of connections and redundant mechanisms? Many different levels of abstraction, variations of topology, etc... possible which obviously results in differing performance, load balancing, locking, redundancy, ACID compliance, etc... In memory/RAM for greater performance, hardware specification, software configuration, operating system optimisation, proxy type (lots of different proxies out there if you want to research this), architectural design... Grid type arrangement - relatively slow but good enough for their purposes. Single star with hub write and all other read is obvious - bespoke can do similar if number of write to read transactions is relatively small (writes are small in comparison with reads). Similar in design decisions made for SQLite versus MySQL. Wonder whether modified/dynamic topologies are realistic (refer back to dynamic routing protocols such as BGP)?
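The queue control/leaky bucket idea from the prototype notes above (a finite limit so that a flood of requests gets rejected instead of piling up) could be sketched like this. The class name and parameters are my own, and the injectable clock is only there so the behaviour can be checked without sleeping:

```python
import time

class LeakyBucket:
    """Admit requests while the bucket has room; it drains at a fixed rate."""

    def __init__(self, capacity: float, leak_rate: float, clock=time.monotonic):
        self.capacity = capacity    # maximum number of queued requests
        self.leak_rate = leak_rate  # requests drained per second
        self.level = 0.0
        self.clock = clock
        self.last = clock()

    def allow(self) -> bool:
        now = self.clock()
        # Drain whatever has leaked out since the previous call.
        self.level = max(0.0, self.level - (now - self.last) * self.leak_rate)
        self.last = now
        if self.level + 1 <= self.capacity:
            self.level += 1
            return True
        return False  # bucket is full, so reject rather than queue forever

# Hypothetical usage with a fake clock: capacity of 2, draining 1 per second.
fake_now = [0.0]
bucket = LeakyBucket(capacity=2, leak_rate=1.0, clock=lambda: fake_now[0])
burst = [bucket.allow() for _ in range(3)]
fake_now[0] = 1.0
after_wait = bucket.allow()
```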
network topology
There are seven basic topologies:[4]
- Point-to-point topology
- Bus (point) topology
- Star topology
- Ring topology
- Tree topology
- Full/partial Mesh topology
- Hybrid topology
Which of these is chosen depends on what devices need to be connected, how reliable it has to be, and the cost associated with cabling.
non blocking io
distributed locking
cache
1 Motivation
1.1 Latency
1.2 Throughput
2 Operation
2.1 Writing policies
3 Examples of hardware caches
3.1 CPU cache
3.2 GPU cache
3.3 DSPs
3.4 Translation lookaside buffer
4 Software caches
4.1 Disk cache
4.2 Web cache
4.3 Memoization
4.4 Other caches
5 Buffer vs. cache
- network based redundancy and load balancing. I've obviously built stuff like this in the past. Curious to see what performance difference is between Nginx and custom built code?
- the more involved in IT you are the more you realise many things can be implemented very simply. A very simple load balancer is the following. It relies on a thread/process per connection so obviously isn't as efficient as a microservice type architecture
nc ip address redirect
You can forward ports from userspace using netcat like this:
nc -l -p $localport -c "nc $remotehost $remoteport"
Example:
nc -l -p 8888 -c "nc example.com 8888"
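Caution: This starts a new process for every connection to the port. Also note that the -c option isn't available in every netcat variant (the OpenBSD version doesn't have it).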
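A rough Python equivalent of the netcat one-liner, extended to rotate across several backends so it behaves like a (very naive) round-robin load balancer. The function names are made up for illustration, it uses threads per connection rather than a process, and it has no health checks or timeouts:

```python
import itertools
import socket
import threading

def pipe(src: socket.socket, dst: socket.socket) -> None:
    """Copy bytes from src to dst until src closes, then half-close dst."""
    try:
        while True:
            data = src.recv(4096)
            if not data:
                break
            dst.sendall(data)
    except OSError:
        pass
    finally:
        try:
            dst.shutdown(socket.SHUT_WR)
        except OSError:
            pass

def handle(client: socket.socket, backend) -> None:
    """Relay one client connection to one backend, in both directions."""
    with client:
        try:
            upstream = socket.create_connection(backend)
        except OSError:
            return  # backend unreachable; drop this client
        with upstream:
            back = threading.Thread(target=pipe, args=(upstream, client), daemon=True)
            back.start()
            pipe(client, upstream)
            back.join()

def serve(listener: socket.socket, backends) -> None:
    """Accept connections until the listener is closed, rotating backends."""
    targets = itertools.cycle(backends)
    while True:
        try:
            client, _ = listener.accept()
        except OSError:
            return  # listener was closed
        threading.Thread(target=handle, args=(client, next(targets)),
                         daemon=True).start()

def start_forwarder(local_port: int, backends, host: str = "127.0.0.1"):
    """Start forwarding in a background thread and return the listening socket."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    listener.bind((host, local_port))
    listener.listen()
    threading.Thread(target=serve, args=(listener, backends), daemon=True).start()
    return listener
```

Something like start_forwarder(8888, [("example.com", 8888)]) would mirror the example above. Passing more than one (host, port) pair spreads new connections across them in turn.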

Random Stuff:
- as usual thanks to all of the individuals and groups who purchase and use my goods and services
- latest in science and technology
crypto investment fund
- latest in finance and politics
- latest in defense and intelligence
- latest in animal news
- latest in music and entertainment

Random Quotes:
- The key recommendation given by the authors of the report is “the United States, other NATO allies and partners, and the European Union could take further concrete steps to support the Baltic countries in developing their total defense and unconventional warfare capabilities.” And they will take these steps even if Latvia is not ready to accept their help.
- US Secretary of State Mike Pompeo was right to call out China’s two-faced duplicity in a tweet on 27 March: ‘The world cannot afford China’s shameful hypocrisy toward Muslims. On one hand, China abuses more than a million Muslims at home, but on the other, it protects violent Islamic terrorist groups from sanctions at the UN.’ Placing the item on the agenda for open Security Council debate and decision would have deepened the reputational damage to China for running a diplomatic protection racket for Pakistan-origin terrorists.
- But the threats to the carrier are mounting, experts say. With the advent of ground-launched hypersonic missiles, it’s a matter of time before air-launched hypersonic missiles present a nearly insurmountable threat, barring a significant development to counter them.

“I think what King’s comments reflect is that he sees the vulnerability of the aircraft carrier only getting worse,” said Bryan Clark, a retired submarine officer and analyst with the Center for Strategic and Budgetary Assessments. “Specifically, maybe not so much these kind of boost-glide weapons, but its more about cruise missiles that are hypersonic – air launched perhaps.

“Then you are talking about something that is relatively inexpensive and could be delivered in large numbers and that would be a bigger deal. Because missile defenses are not necessarily built for hypersonic weapons.

“So we’ll have to find a way to deal with this new challenge, or we’ll have to rethink how we do things.”
- In 1988 military spending was a single line item in the state budget, totaling 21 billion rubles, or about US$33 billion. Given the size of the military establishment, however, the actual figure was at least ten times higher. Western experts concluded that the 21 billion ruble figure reflected only operations and maintenance costs. The amount spent on Soviet weapons research and development was an especially well-guarded state secret, and other military spending, including training, military construction, and arms production, was concealed within the budgets of all-union ministries and state committees. Apart from considerations of state secrecy, this allocation of military spending to ministries other than the Ministry of Defense reflected the Soviet approach to managing resource allocation. Weapons produced by agencies such as the Ministry of General Machinebuilding [missiles] or the Ministry of Shipbuilding Industry [ships] were essentially provided as "free goods" to the Ministry of Defense.

Since the mid-1980s, the Soviet Union devoted between 15 and 17 percent of its annual gross national product to military spending, according to United States government sources. Until the early 1980s, Soviet defense expenditures rose between 4 and 7 percent per year. Subsequently, they slowed as the yearly growth in Soviet GNP slipped to about 3 percent. In 1987 Gorbachev and other party officials discussed the extension of glasnost' to military affairs through the publication of a detailed Soviet defense budget. In early 1989, Gorbachev announced a military budget of 77.3 billion rubles, but Western authorities estimated the budget to be about twice that.
- The most recent formal US report on Chinese currency manipulation — from May this year — found no evidence of Chinese intervention.

“Based on Treasury estimates, direct intervention in foreign exchange markets by the People’s Bank of China over the last several months appears limited,” the report said, declining to label China as a currency manipulator.

But that all changed today.

Technically, it may be the case that China was manipulating the value of its currency earlier and what made it fall is that they stopped manipulating. But that is by the by. Trump promised during his campaign to designate China as a currency manipulator, and now he has.

So the trade war goes on. China has already chosen to retaliate by not buying US farm products. The question is how bad the economic casualties will have to get before the two sides call a truce.
- Ma says blaming China for any economic issues in the U.S. is misguided. If America is looking to blame anyone, Ma said, it should blame itself.
“It’s not that other countries steal jobs from you guys,” Ma said. “It’s your strategy. Distribute the money and things in a proper way.”
He said the U.S. has wasted over $14 trillion in fighting wars over the past 30 years rather than investing in infrastructure at home.
To be sure, Ma is not the only critic of the costly U.S. policies of waging war against terrorism and other enemies outside the homeland. Still, Ma said this was the reason America’s economic growth had weakened, not China’s supposed theft of jobs.
In fact, Ma called outsourcing a “wonderful” and “perfect” strategy.
“The American multinational companies made millions and millions of dollars from globalization,” Ma said. “The past 30 years, IBM, Cisco, Microsoft, they’ve made tens of millions — the profits they’ve made are much more than the four Chinese banks put together. ... But where did the money go? ”
He said the U.S. is not distributing, or investing, its money properly, and that’s why many people in the country feel wracked with economic anxiety. He said too much money flows to Wall Street and Silicon Valley. Instead, the country should be helping the Midwest, and Americans “not good in schooling,” too.
“You’re supposed to spend money on your own people,” Ma said. “Not everybody can pass Harvard, like me.” In a previous interview, Ma said he had been rejected by Harvard 10 times. 
Along those lines, Ma stressed that globalization is a good thing, but it, too, “should be inclusive,” with the spoils not just going to the wealthy few.
“The world needs new leadership, but the new leadership is about working together,” Ma said. “As a business person, I want the world to share the prosperity together.”
- Today, less than a third of Koreans have tried dog meat, and even fewer people eat it regularly but a considerable number of restaurants in the country still have it on the menu. In Seoul it's fairly hard to find, but there are dog abattoirs on the outskirts of the capital – at markets or farms. Young people in particular don't appreciate the idea of munching on a pet; on my trip, I asked a local couple of twenty-somethings where I might be able to eat dog, and they both responded with obvious disgust and irritation.
- “The upgrade will boost the fighter’s combat capabilities: it will increase the range of detecting and identifying air targets and furnish the plane with new precision weapons to hit air, ground and sea targets at a range of several hundred kilometres,” Russian state media reported.

The Su-30SM upgrade is touted to boost the fighter’s combat capabilities by increasing the range of detection and identification of aerial targets. According to the Russian media, the upgrade will furnish the plane with new precision weapons to hit air, ground and sea targets at a range of several hundred kilometres.

On the other hand, Russia’s most advanced fighter till date, the Su-57 fifth-generation stealth fighter is powered by two engines: AL-41F-1 and the Izdelie 30. The former produces 144.5kN (14,734 kgf) while the latter produces 189kN (19,272 kgf) of afterburning thrust. The main avionics systems are the Sh121 Multifunctional Integrated Radio Electronic System (MIRES) and the 101KS Atoll Electro-Optical System.

The Sh121 MIRES consists of the N036 Byelka radar and the L402 Himalayas electronic counter-measures systems. The N036 Byelka radar features five AESA (Active Electronically Scanned Arrays) systems, three X-band and two L-band, and it’s also the first AESA radar used on a Russian aircraft. 

The Su-57 entered serial-production of the jets this July, and the first of the 76 ordered jets will be delivered to the Russian forces by the end of the year. An export version of the jet, the Su-57E, could be offered to prospective customers in the Middle East and Asia at the Dubai Air show in November this year. According to the reports, an aircraft carrier version of the fighter is also in the cards.
