Monday, October 14, 2019

Web Crawler Script, Random Stuff, and More

- I had to build a custom web crawler/spider recently (partly for computer security interests, partly for personal interest, and partly for the medical search engine that I've been working on). I didn't expect that so many spiders/crawlers basically downloaded entire sites prior to indexing, were highly dependent, massive, etc... Much like my previous experiences with custom downloaders, I figured out that building something was easier, better, faster, etc... than trying to find something. Download it here:
web_crawler-1.05.zip
http://sites.google.com/site/dtbnguyen/
- description is as follows:
# My local setup often doesn't allow me enough control to run Python,
# Java, or Ruby based crawlers. Moreover, these solutions are often slow,
# inflexible, very limited, inefficient, lacking in feedback, etc... so I
# decided to build my own. It has some limitations such as not being able
# to deal with JavaScript, being slow relative to crawlers built using
# compiled languages, etc... but I'm fine with that as my needs are
# pretty basic. Good for automatically finding pages that may be useful
# but you don't know about. Use in combination with a mass downloader
# of some sort and it's obvious this can often be much more efficient than
# pure website downloaders such as Teleport Pro and HTTrack:
# http://www.tenmax.com/teleport/exec/home.htm
# http://www.httrack.com/
#
# Given its nature it's probably better suited for smaller projects.
#
# As this is the very first version of the program it may be VERY buggy.
# Please test prior to deployment in a production environment.
#
- I obviously did some prior research on this. Surprisingly, there wasn't much out there that matched what I wanted:
lightweight web crawler spider
https://github.com/BruceDone/awesome-crawler
https://scrapinghub.com/open-source
https://bulkscraping.com/blog/top8-free-tools-for-automated-web-scraping/
bash web spider
https://gist.github.com/antoineMoPa/ada42dcfc96197e38dc8c4df363aed72
https://williamjturkel.net/2013/09/29/writing-a-simple-web-spider-using-command-line-tools-in-linux/
extract all links from website linux
https://unix.stackexchange.com/questions/116987/how-do-i-use-wget-to-download-all-links-from-my-site-and-save-to-a-text-file
https://adamdehaven.com/blog/easily-crawl-website-and-fetch-all-links-with-shell-script/
https://github.com/adamdehaven/fetchurls
https://stackoverflow.com/questions/2804467/spider-a-website-and-return-urls-only
pdf site:https://thebillionairejournal.wordpress.com/
pdf thebillionairejournal
httrack links only
https://forum.httrack.com/readmsg/24984/24983/index.html
https://forum.httrack.com/readmsg/36436/36381/index.html
HTTrack uses a Web crawler to download a website. Some parts of the website may not be downloaded by default due to the robots exclusion protocol unless disabled during the program. HTTrack can follow links that are generated with basic JavaScript and inside Applets or Flash, but not complex links (generated using functions or expressions) or server-side image maps.
https://en.wikipedia.org/wiki/HTTrack
https://github.com/xroche/httrack
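
For anyone curious what the "find the links, don't download the whole site" idea looks like in practice, here is a minimal sketch. It is NOT the script in the zip above: it's written in Python purely for illustration, uses only the standard library, and all of the names (LinkParser, extract_links, fetch_page, crawl) and the max_pages default are my own assumptions. Like the crawler described above, it can't deal with links generated by JavaScript, and it ignores robots.txt and rate limiting, so treat it as a starting point only.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin, urlparse
from urllib.request import urlopen


class LinkParser(HTMLParser):
    """Collect raw href values from <a> tags."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def extract_links(html, base_url):
    """Return absolute http(s) URLs found in html, with #fragments removed."""
    parser = LinkParser()
    parser.feed(html)
    links = []
    for href in parser.hrefs:
        absolute, _fragment = urldefrag(urljoin(base_url, href))
        if urlparse(absolute).scheme in ("http", "https") and absolute not in links:
            links.append(absolute)
    return links


def fetch_page(url):
    """Hypothetical default fetcher: download a single page as text."""
    with urlopen(url, timeout=10) as response:
        return response.read().decode("utf-8", errors="replace")


def crawl(start_url, fetch=fetch_page, max_pages=50):
    """Breadth-first walk of one host; returns the URLs visited, in order."""
    host = urlparse(start_url).netloc
    seen = {start_url}
    queue = deque([start_url])
    visited = []
    while queue and len(visited) < max_pages:
        url = queue.popleft()
        try:
            html = fetch(url)
        except Exception:
            continue  # skip pages that fail to download
        visited.append(url)
        for link in extract_links(html, url):
            # Stay on the starting host and never queue a URL twice.
            if urlparse(link).netloc == host and link not in seen:
                seen.add(link)
                queue.append(link)
    return visited
```

Calling crawl("http://example.com/") would return a list of page URLs on that host, which can then be handed to a mass downloader of your choice. The fetch parameter is there so that the download step can be swapped out (for caching, throttling, or testing without touching the network).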

Random Stuff:
- as usual thanks to all of the individuals and groups who purchase and use my goods and services
http://sites.google.com/site/dtbnguyen/
http://dtbnguyen.blogspot.com.au/
- latest in science and technology
https://www.rt.com/news/470642-amazon-cloud-cam-spying-algorithms/
Random String Generator in C++
http://www.cplusplus.com/forum/windows/88843/
https://ostinato.org/
https://www.solarwinds.com/topics/traffic-generator-wan-killer
https://uniclixapp.com/
https://www.bitcoinometrics.club/
https://www.facebook.com/groups/ChooseFI/
https://www.facebook.com/groups/StartupsMelbourne/
https://www.cnbc.com/2019/10/08/carousell-how-3-friends-turned-used-goods-into-a-550-million-startup.html
https://www.arnaudbonzom.com/reports-and-articles-on-asian-digital-economies-and-startup-ecosystems-with-a-focus-on-southeast-asia/
https://www.eventbrite.com.au/e/keys-to-success-melbourne-october-2019-registration-68576802061?fbclid=IwAR2q8Hgf40RiJfIDyA28ZR6g2KAZNEX2H8MtQfaKzGk-p2Qq3cYZog7UKkw
https://www.nytimes.com/2019/10/08/technology/silicon-valley-startup-profit.html
https://phys.org/news/2019-09-nonviral-gene-therapy-cancer.html
https://www.grahamcluley.com/stalker-zoomed-in-on-japanese-idols-eyes-to-find-out-where-she-lived/
https://arstechnica.com/information-technology/2019/10/former-yahoo-engineer-admits-he-hacked-user-accounts-in-search-of-sexual-images/
https://www.rt.com/news/470829-russia-moon-base-rent/
https://www-microsoft-com.cdn.ampproject.org/c/s/www.microsoft.com/en-us/microsoft-365/blog/2019/09/18/why-banks-adopt-modern-cybersecurity-zero-trust-model/amp/
https://www.itwire.com/health/startup-%E2%80%98performance-risks%E2%80%99-compromised-by-founder-wellbeing-report-2.html
https://www.inc.com/peter-economy/microsoft-career-experts-say-that-your-resume-should-always-have-these-5-simple-things.html
https://www.abc.net.au/news/science/2019-10-12/koala-retro-virus-shed-light-on-second-immune-system/11583782
https://cointelegraph.com/news/german-programmer-hacks-back-after-bitcoin-ransomware-attack
https://www.sfgate.com/business/technology/article/China-criticizes-Apple-for-app-that-tracks-Hong-14503012.php
https://www.businessinsider.com.au/uber-lyft-driver-how-much-money-2019-10
https://itwire.com/strategy/swinburne-uni-uses-ai-to-detect-fast-radio-bursts-in-real-time.html
https://www.thesun.co.uk/uncategorized/9699009/ancient-signals-from-outer-space-detected/
https://www.nbcnews.com/mach/science/master-plan-universe-revealed-new-galaxy-maps-ncna1040936
https://phys.org/news/2019-08-physicists-year-old-optical-problem.html
https://www.geekwire.com/2019/geek-street-inspires-summit-attendees-point-tech-leaders-family-others/
https://www.sciencenews.org/article/proposed-space-telescope-would-use-earth-atmosphere-lens
https://www.news.com.au/technology/science/human-body/clandestine-drug-labs-can-leave-houses-toxic-for-years-after-detection/news-story/20ba317bec2923dbc6504eeefdad56e8
https://www.afp.com/en/news/23/real-fake-research-hoodwinks-us-journals-doc-19r9uv4
http://newport.classifieds-free.co.uk/jobs/welsh-personalized-internet-assessor_hsk3857
https://www.newscientist.com/article/2213058-milky-ways-black-hole-has-got-75-times-brighter-and-we-dont-know-why/
https://www.zerohedge.com/personal-finance/third-world-3-million-californians-lose-power-pge-begins-unprecedented-blackouts
https://www.tweaktown.com/news/68054/elon-musk-nasa-free-share-spacex-ip-anyone-wants/index.html
https://www.itwire.com/open-sauce/windows-ransomware-when-will-people-in-charge-ever-learn.html
https://www.itwire.com/open-source/open-source-vendor-suse-gives-openstack-the-boot.html
https://medium.com/@mahakaal/malware-engineering-part-0x1-that-magical-elf-5be3556ecb2b
https://onezero.medium.com/the-future-of-a-i-is-probably-chinese-ab6a8cf5c927
https://www.linkedin.com/feed/news/telstra-workers-are-too-productive-4726956/
- latest in finance and politics
Axios Markets
https://www.axios.com/newsletters/axios-markets-894f0dbf-1d0e-4b04-9e2e-548ec63d6da3.html
https://www.policyforum.net/the-nba-in-china-who-calls-the-shots/
https://www.sbs.com.au/news/the-united-nations-is-in-a-severe-financial-crisis-this-is-what-that-means
https://www.rt.com/op-ed/470662-peter-handke-nobel-controversy/
https://gizmodo.com/apple-sells-out-pro-democracy-protesters-in-hong-kong-t-1838932096
https://www.themonthly.com.au/blog/mungo-maccallum/2019/07/2019/1570414741/mateship-what-cost
https://www.commondreams.org/news/2019/09/17/cutting-health-benefits-1900-whole-food-workers-saved-worlds-richest-man-jeff-bezos
https://independentaustralia.net/politics/politics-display/robodebt-claims-the-life-of-a-19-year-old-mum,13127
https://www.afr.com/companies/financial-services/regulators-want-customers-to-switch-banks-20191010-p52zgk
https://www.afr.com/chanticleer/behind-the-open-banking-revolution-20190626-p521k7
https://www.theguardian.com/australia-news/2019/oct/14/disturbingly-lightweight-penny-wong-targets-morrison-over-china-and-negative-globalism
https://www.theguardian.com/commentisfree/2019/oct/13/queen-might-be-admired-crown-has-seldom-seemed-so-empty
https://www.abc.net.au/news/2019-09-21/donald-trump-state-dinner-scott-morrison-welcome-distraction/11533602
https://www.abc.net.au/news/2019-09-06/media-freedom-is-at-risk-when-politics-and-fundraising-mix/11484434
https://www.pbs.org/newshour/press-releases/pbs-newshour-series-china-power-and-prosperity-explores-todays-china
https://www.news.com.au/technology/environment/climate-change/abcs-insiders-program-ends-with-snide-swipe-at-teenage-activist-greta-thunberg/news-story/4b66071882839d6de96b5ad2bab10972
https://www.theguardian.com/business/2019/sep/21/wrong-place-at-the-wrong-time-how-the-us-china-trade-war-is-putting-the-squeeze-on-australia
https://www.businessinsider.com.au/trucking-truck-drivers-job-loss-september-2019-10
https://www.afr.com/world/asia/why-china-s-one-way-door-on-investment-is-not-working-20191009-p52z2d
https://www.abc.net.au/news/2019-10-08/white-supremacist-neo-nazi-concert-in-melbourne-to-go-ahead/11582120
https://www.theage.com.au/national/sexist-racist-and-homophobic-why-i-m-glad-larrikin-humour-is-gone-20191010-p52zd8.html
https://www.theguardian.com/australia-news/2019/oct/05/qanon-conspiracy-theorist-friends-australian-prime-minister-scott-morrison
https://www.theguardian.com/australia-news/2019/oct/05/scott-morrison-unleashes-foreign-policy-fetish-while-staying-passive-on-economy
https://www.abc.net.au/news/2019-10-11/cheap-as-chips-to-stop-selling-glue-mouse-traps-peta-says/11592804
https://www.abc.net.au/life/beginners-guide-to-buying-property/11437736
https://www.abc.net.au/news/2019-03-14/couple-went-without-water-electricity-to-save-home-deposit/10897324
https://www.abc.net.au/news/2019-10-05/jon-faines-career-of-questions-on-abc-melbourne/11542080
https://www.abc.net.au/news/2019-10-04/hong-kong-bans-masks-for-protesters-explainer/11573842
https://www.presstv.com/Detail/2019/09/28/607382/Iran-bank-branches-closure-announcement
https://www.rt.com/op-ed/467716-harvard-wealth-countries-status-quo/
https://www.linkedin.com/feed/news/want-to-be-a-ceo-pack-your-bags-4423731/
https://www.afr.com/work-and-careers/leaders/gen-x-ceos-are-different-to-what-you-would-expect-20190901-p52mtv
https://www.msn.com/en-au/news/australia/granddad-forced-to-repay-robodebt-he-believes-he-doesnt-owe/ar-AAIum5f?li=AAgfYrC
https://au.finance.yahoo.com/news/australia-at-risk-from-south-korea-japan-trade-war-053201024.html
https://www.dailytelegraph.com.au/news/world/how-to-know-which-retailers-will-take-the-biggest-hit-from-a-trade-war/video/132d93963f61bf15965feb3bfd50686b
https://www.ifallsjournal.com/news/opinion/what-others-say-trade-war-unilaterally-triggered-by-u-s/article_baab5e8f-9589-58be-a949-e66617354075.html
http://www.travelweekly.com.au/article/passenger-sexually-harassed-via-virgin-atlantic-in-flight-messages/
https://www.investordaily.com.au/markets/45737-trump-s-trade-war-has-backfired-dws
https://www.investordaily.com.au/analysis/45731-ongoing-trade-war-could-push-aud-below-60-cents-how-could-investors-potentially-profit-from-this
https://www.linkedin.com/feed/news/the-great-diversity-conundrum-5054298/
https://www.linkedin.com/feed/news/is-this-permission-to-steal-4712916/
https://www.zerohedge.com/markets/return-hyperinflation-zimbabwe
https://www.zerohedge.com/political/americas-political-implosion
- latest in defense and intelligence
https://www.aspistrategist.org.au/projecting-power-with-the-f-35-part-1-how-far-can-it-go/
https://www.aspistrategist.org.au/projecting-power-with-the-f-35-part-2-going-further/
https://www.aspistrategist.org.au/projecting-power-with-the-f-35-part-3-operational-implications/
https://au.news.yahoo.com/iraq-war-whistleblower-film-bush-blair-sights-100150905--spt.html
https://www.aljazeera.com/programmes/listeningpost/2019/10/iraq-protests-establishment-fighting-heard-191012085331859.html
https://nationalinterest.org/blog/buzz/what-would-happen-if-america-killed-north-koreas-kim-jong-un-87656
https://www.msn.com/en-us/news/world/s-koreas-army-says-will-cut-force-by-100000-in-three-years/ar-AAID0Ik
https://defense-update.com/20101231_arming-the-shadows.html
https://www.weeklyblitz.net/news/shadow-war-between-iran-and-israel/
https://www.zerohedge.com/geopolitical/did-china-just-announce-end-us-primacy-pacific
https://www.zerohedge.com/geopolitical/serious-malfunction-how-french-intelligence-overlooked-terrorist-their-ranks
https://www.rt.com/news/470667-iranian-tanker-explosion-jeddah/
https://www.dw.com/en/should-turkey-fear-the-long-arm-of-us-sanctions/a-50762837
https://au.news.yahoo.com/iran-tanker-hit-suspected-missile-strikes-near-saudi-074024883--spt.html
https://www.abc.net.au/news/2019-10-10/record-number-chinese-military-personnel-at-defence-conference/11587580
https://www.abc.net.au/news/2019-08-19/australia-urged-to-divert-middle-east-military-operations/11425648
https://sputniknews.com/world/201910101077012589-top-secret-radar-upgrade-reportedly-allowed-us-navy-to-finally-spot-ufos/
https://www.smh.com.au/world/north-america/under-attack-from-allies-trump-says-kurds-never-helped-us-in-wwii-20191010-p52zd6.html
https://www.independent.co.uk/news/world/europe/turkey-russia-us-missile-defence-system-nato-mike-pence-trump-a8854276.html
https://www.theguardian.com/world/2019/oct/09/protecting-rioters-china-warns-apple-over-app-that-tracks-hong-kong-police
https://www.presstv.com/Detail/2019/10/09/608214/America-cant-be-trusted-at-all-it-stabs-own-allies-in-the-back-Nasrallah
https://www.presstv.com/Detail/2019/09/28/607378/US-Pentagon-Suicide-Interview-Bennett
https://sputniknews.com/analysis/201910051076964298-india-should-seek-russias-help-to-develop-hypersonic-missile-security-analyst/
https://www.thehindu.com/news/national/shooting-down-our-own-chopper-big-mistake-says-iaf-chief/article29593737.ece
https://www.indiatoday.in/india/story/indian-air-force-planes-crashes-personnel-killed-in-2019-1603033-2019-09-25
https://www.indiatoday.in/india/story/questions-raised-cheetah-choppers-flight-worthiness-after-crash-1605877-2019-10-03
https://www.rt.com/op-ed/470020-twitter-psyops-british-officer-media/
https://www.channelnewsasia.com/news/singapore/exercise-forging-sabre-apache-fighter-pilots-enemy-data-faster-11980914
http://www.defencenews.in/article/Rafale-engine-manufacturer-Safran-offers-to-help-India-develop-1st-indigenous-Jet-Engine-737338
- latest in animal news
https://www.smh.com.au/sport/stubborn-as-a-mule-the-champion-racehorse-that-went-on-strike-20191007-p52ydd.html
https://sputniknews.com/videoclub/201910101077008870-baby-dog-is-beseeched-by-raccoons/
https://www.onenewspage.com.au/n/Front+Page/1zkk83gpyi/Alaska-Zoo-polar-bear-treated-for-medical-condition.htm
- latest in music and entertainment
https://www.geelongadvertiser.com.au/lifestyle/food/big-myth-about-jam-doughnuts-busted/news-story/54352dc48d325bcfd918b58b14ac3fbf
https://www.sbs.com.au/food/article/2019/10/05/pancake-meets-doughnut-koreas-hotteok
https://www.australiangolfdigest.com.au/phil-mickelson-explains-why-he-hit-driver-out-of-that-bush-as-only-phil-mickelson-can/
https://www.golfchannel.com/news/so-long-farewell-henrik-stenson-returns-houston-without-his-trusty-3-wood
https://www.foxsports.com.au/basketball/nba/ben-simmons-made-a-threepointer-this-is-not-a-drill/news-story/c88d945992cf675bee4722f80b02adef
https://www.afp.com/en/news/3955/kipchoge-monastic-marathon-history-maker-doc-1l977z9
https://www.msn.com/en-au/lifestyle/style/dollar3000-jesus-shoes-filled-with-holy-water-sell-out-in-minutes/ar-AAIzFh1?li=AAgfYrC
https://www.news.com.au/sport/motorsport/formula-one/charles-leclerc-vs-sebastian-vettel-ferraris-costly-rivalry/news-story/1bc3f3da7ea00a90d2e20334fddce795
https://www.news.com.au/sport/motorsport/formula-one/ricciardo-settles-lawsuit-over-reported-18m-dispute/news-story/3681f35a6f4482724d9556658645e200
https://www.news.com.au/lifestyle/health/johnson-johnson-ordered-to-pay-118b-after-drug-caused-man-to-grow-breasts/news-story/faba3ea99dfbb16f9ccca21b47f9c948

Random Quotes:
- When the Japanese space agency JAXA bombed the asteroid Ryugu in April, it made a far bigger crater than anticipated. Additionally, the material on the surface behaved like sand, which may impact the effectiveness of deflection. 

“If gravity is also dominant at Didymos B, even though it is much smaller, we could end up with a much bigger crater than our models and lab-based experiments to date have shown,” explained planetary scientist Patrick Michel of CNRS.

“Ultimately, very little is known about the behaviour of these small bodies during impacts and this could have big consequences for planetary defence.”

The DART will ram into Didymos B at 23,760 kilometers per hour (14,760mph). However, all that force will only translate into a change in the asteroid’s velocity of just a centimetre per second or so, which could change the orbital period from almost 12 hours to a mere matter of minutes.
- My PoV regarding the Su-57 is that it'd be an impressive fighter if it were 4th gen, it'd probably be able to outmaneuver Eurocanards, F-18s, F-16s, etc. Against 5th gen, it'll outmaneuver F-35s and probably will outmaneuver the F-22.

However, the weakness in stealth and lack of sensor fusion, as others have said, is a critical failing that prevents it from being able to compete with 5th gen in a 1v1 fight.

The trick is, though, if the Su-57 is as cheap as the Russians make it out to be, it doesn't have to be that stealthy. It could very well use true stealth aircraft to cover its approach, then kill and get killed with HOBS missiles in short-range fights and still win strategically.

Of course, as a pilot, you definitely wouldn't want to be in an Su-57, and if you were, you'd probably be fed some nostrum about how your maneuverability will save you from being bad at stealth and you'll believe it because the alternative is knowing you're cannon fodder.

That's the angle the Su-57 comes from, i.e., it's a heavyweight fighter that costs less than a middleweight fighter and has acceptable stealth characteristics. Think back to T-34s vs Tigers, as well as the fact that T-34s were supported by SU-152s and other SPGs that specialized in killing Nazi heavy tanks.

The Su-57 doesn't have to be good, provided the Russians crank enough out of the Sukhoi factories.

As for "raw materials cost to the Russian economy", the actual economics term is opportunity cost, but as I've stated before, markets mean that the opportunity cost is effectively nil because the Russians have so much difficulty getting people to uptake the raw materials.

The one last possibility about the Su-57 is that the Chinese went full hog on 3D-printed titanium for their J-20s and J-31 prototypes. The advantage of 3D-printed titanium is that it reduces both weight compared to traditional titanium and cost compared to both carbon fiber and traditional titanium. Aerospace-grade carbon fiber costs between 45,000 and 220,000 per ton. Titanium powder on Alibaba can be obtained for the equivalent of 30,000 per ton.

So it's actually a riff off the MiG-25 brouhaha: the MiG-25 was originally expected to be composed of titanium, making it an extremely agile fighter due to low wing loading. The Soviets, of course, decided to cut costs and were looking for interception anyways, so went with nickel steel instead. The Su-57, if it were using composites aggressively, would be quite lightweight and expensive. But if, say, they went with titanium as a composites replacement, they'd gain some weight but also significantly reduce costs.

The problem with this hypothesis is that it's the Chinese who have the 3D-printer technology, although it wouldn't be that challenging for the Russians to have that as well.
- Chit chat doesn’t come naturally to all of us. But, as Lindsay Mannering writes in The New York Times, building rapport through casual conversation is key to thriving at work. It’s what helps our coworkers see us as the wonderful humans we are, and it inclines them to cut us some slack when we need it. To make small talk less awkward, though, remember you’re more likeable than you think. We are prone to judge our gabbing skills more harshly than our conversation partners. And it never hurts to prepare: Have a few small talk topics at the ready, so you’re not caught flat-footed when opportunity arises.
- The Arctic is home to at least 20-25% of the world's untapped fossil-fuel resources, along with minerals, including gold, silver, diamond, copper, titanium, graphite, uranium, and other rare earth minerals.

Canada, Denmark, Finland, Iceland, Norway, Russia, Sweden, and the US have established the Arctic Council, an intergovernmental cooperation forum, that holds biennial ministerial meetings on the region. It's an attempt to stabilize the Arctic and avoid conflict as corporations and governments rush into the area to seize new economic opportunities.

With the latest news of Russia militarizing the Arctic, this will not sit well with the Trump administration, which has been trying to buy Greenland. The Arctic and its resources will be a major topic of the 2020s and beyond; the first to secure its military in the region could become the next superpower of the world.

Dodgy Job Contract Clauses, Random Stuff, and More

- in this post we'll be going through dodgy job contract clauses. Ironically, many of which are actually unlawful and unenforceable on c...