Difference between revisions of "Computer Productivity Hacks"

From TeleCafeWiki
(More wget tools and links; OpenRefine, Data Wrangler.)
(Add multiple resources.)
Line 89:

* [https://github.com/coolwanglu/pdf2htmlEX pdf2htmlEX]
: Convert PDF to HTML without losing text or format.
+
+ == Educate Yourself ==
+ === Multiple Codes ===
+ * [http://www.codecademy.com/learn Codecademy]
+ : Learn to code while building a project. Free online courses include: HTML & CSS; jQuery; JavaScript; PHP; Python; Ruby; Web Projects; APIs.
+
+ === R ===
+ * [https://www.datacamp.com/ DataCamp]
+ : Learn R & Become a Data Analyst
+
+ === D3.js ===
+ * [https://www.youtube.com/user/d3Vienno d3Vienno]
+ : '''d3Vienno''' features a series of video tutorials, each about 10-12 minutes long, on using D3.js.
+
+ * [http://gisciencegroup.ucalgary.ca/engo500/texts/Interactive_Data_Visualization_for_the_Web.pdf Interactive Data Visualization for the Web: An Introduction to Designing with D3]
+
+ ; D3 Tools
+ * [http://nytimes.github.io/svg-crowbar/ SVG Crowbar]
+ : A Chrome-specific bookmarklet that extracts SVG nodes and accompanying styles from an HTML document and downloads them as an SVG file, which you could then open and edit in Adobe Illustrator, for instance. Because SVGs are resolution independent, it’s great for when you want to use web technologies to create documents that are meant to be printed (like, maybe on newsprint). It was created with d3.js in mind, but it should work fine no matter how you choose to generate your SVG.

 
== Data Analysis Tools ==
+ * [http://d3js.org/ D3.js - Data-Driven Documents]
+ : '''D3.js''' is a JavaScript library for manipulating documents based on data. D3 helps you bring data to life using HTML, SVG and CSS. D3’s emphasis on web standards gives you the full capabilities of modern browsers without tying yourself to a proprietary framework, combining powerful visualization components and a data-driven approach to DOM manipulation.
+
+ * [https://www.rstudio.com/ide/download/ Download RStudio]
+ : Take control of your R code. RStudio is the premier integrated development environment for R. It is available in open source and commercial editions and runs on the desktop (Windows, Mac, and Linux) or over the web with RStudio Server. Download RStudio (for Windows, Mac, or Linux).
+
 
* [http://openrefine.org/ OpenRefine] (formerly [http://code.google.com/p/google-refine/ Google Refine])
- '''[[wikipedia:OpenRefine|OpenRefine]]''' is a standalone open source desktop application for data cleanup and transformation to other formats, an activity known as [[wikipedia:Data wrangling|data wrangling]]. It is similar to [[wikipedia:Spreadsheet|spreadsheet]] applications (and can work with spreadsheet file formats); however, it behaves more like a database.
+ : '''[[wikipedia:OpenRefine|OpenRefine]]''' is a standalone open source desktop application for data cleanup and transformation to other formats, an activity known as [[wikipedia:Data wrangling|data wrangling]]. It is similar to [[wikipedia:Spreadsheet|spreadsheet]] applications (and can work with spreadsheet file formats); however, it behaves more like a database.
  
 
* [http://vis.stanford.edu/wrangler/ Data Wrangler] (Stanford Visualization Group)

Revision as of 18:21, 6 April 2014

Command Line

Mounting shared drives and connecting to remote resources is something you can easily do from the Windows GUI. With our quick guide to the command prompt, however, you can more easily automate large tasks.
A quick crash course in using the command line. It is meant to be completed in a day or two, not to teach advanced shell usage.

WGET

GNU Wget is a free network utility that retrieves files from the World Wide Web using HTTP and FTP, the two most widely used Internet protocols. It works non-interactively, so it can keep working in the background after you have logged off.
Say you want to back up your blog or create a local copy of an entire directory of a web site for archiving or reading later. The command: wget -m http://website.tld
Tips for mirroring specific directories, updating only changed files, etc.
Download files using curl or wget. This add-on generates curl/wget commands that emulate the request as though it were coming from your browser, allowing you to download protected files directly to a separate machine (e.g. a server).
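
The mirroring command above can also be assembled from a script. The following is a minimal sketch rather than a definitive recipe: the helper names build_mirror_command and mirror and the no_parent parameter are hypothetical, and the only wget options used are -m (mirror) and -np (do not ascend to the parent directory). Running it for real assumes wget is installed and on the PATH.

```python
import subprocess


def build_mirror_command(url, no_parent=False):
    """Return the argument list for `wget -m URL` (hypothetical helper)."""
    cmd = ["wget", "-m"]  # -m turns on mirroring (recursion plus timestamping)
    if no_parent:
        cmd.append("-np")  # -np keeps wget inside the starting directory
    cmd.append(url)
    return cmd


def mirror(url, no_parent=False):
    """Run wget; assumes wget is installed and on the PATH."""
    return subprocess.run(build_mirror_command(url, no_parent), check=True)


if __name__ == "__main__":
    # Prints the command instead of running it, so nothing is downloaded.
    print(" ".join(build_mirror_command("http://website.tld", no_parent=True)))
```

Building the argument list separately from running it makes it easy to check the command before any download starts.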

Windows Command Line

Cmdlets are the heart-and-soul of Windows PowerShell, Microsoft's latest command shell/scripting language.
Robocopy: A robust file copy command for the Windows command line.
Each command is linked to more info about the particular command.
The xcopy command is a Command Prompt command that copies one or more files or folders from one location to another.
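
For comparison, a folder copy of the kind xcopy and Robocopy perform can be sketched in Python. This is only an illustration under stated assumptions: copy_tree is a hypothetical helper, it relies on shutil.copytree with dirs_exist_ok (Python 3.8 or later), and it does not attempt Robocopy's retry or mirroring behaviour.

```python
import shutil
from pathlib import Path


def copy_tree(src, dst):
    """Copy the folder src and everything in it to dst (hypothetical helper).

    dirs_exist_ok=True (Python 3.8+) lets the copy merge into an existing
    destination, overwriting files that have the same name.
    """
    return Path(shutil.copytree(src, dst, dirs_exist_ok=True))
```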

Generate File List

Example: C:\Users\me\Downloads\MyFolder> dir /b > filelist.txt
(The command is the part after the prompt, dir /b > filelist.txt; run it once you've navigated into the directory from which you want to generate the list of file names.)
This tutorial contains several working answers for using Windows PowerShell to list files and folders.
Works with cmd.exe, but doesn't seem to work with Windows PowerShell.
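
The same kind of file list can be produced without either shell. The sketch below is an assumption-laden illustration, not part of the linked tutorials: write_file_list is a hypothetical helper, and the default output name filelist.txt is taken from the example above.

```python
import os


def write_file_list(folder, list_name="filelist.txt"):
    """Write the names in folder, one per line, to folder/list_name."""
    names = sorted(os.listdir(folder))  # bare names only, like dir /b
    out_path = os.path.join(folder, list_name)
    with open(out_path, "w", encoding="utf-8") as handle:
        for name in names:
            handle.write(name + "\n")
    return names
```

The names are read before the output file is created, so on a first run the list does not include filelist.txt itself.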

Folder & File Compression

Example (for use in a batch file; at the interactive prompt, write %X instead of %%X): for /d %%X in (*) do "c:\Program Files\7-Zip\7z.exe" a "%%X.zip" "%%X\"
To compress a folder without using any particular compression software.
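
A loop like the 7-Zip example above, one archive per subfolder, can also be sketched with Python's standard library. This is a rough equivalent under assumptions, not the method of either linked page: zip_each_subfolder is a hypothetical helper, and it uses shutil.make_archive instead of 7-Zip.

```python
import shutil
from pathlib import Path


def zip_each_subfolder(parent):
    """Create NAME.zip next to every subfolder NAME of parent."""
    parent = Path(parent)
    archives = []
    for sub in sorted(p for p in parent.iterdir() if p.is_dir()):
        # base_dir keeps the folder name as the top-level entry in the zip.
        archive = shutil.make_archive(
            str(parent / sub.name), "zip", root_dir=str(parent), base_dir=sub.name
        )
        archives.append(Path(archive))
    return archives
```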

Text Extraction

Capture2Text enables users to do the following:
  1. Optical Character Recognition (OCR)
  2. Speech Recognition
Lists several options.
Extracts plain text from documents in all popular formats.
Detexter is an app designed to extract text from PDF files.

Data Scrape

Tools and tips compiled by journalists from PBS and Omaha World-Herald.
Scrapinghub's list of open source scraping projects.
Tools for gathering data from public sources.

Text Search

Makes tools to search text content, including:
  1. FALCON - Text Search Java Project: JSON based text search Java Project
  2. HAWK - PDF Text Search Java Project: Taking initiative for Document Text Search
Xpdf is an open source viewer for Portable Document Format (PDF) files.
Windows installer: Short Programs/Scripts (Look for the xpdf3.exe / poppler.exe links in left sidebar.)

PDF Conversion

Convert PDF to HTML without losing text or format.

Educate Yourself

Multiple Codes

Learn to code while building a project. Free online courses include: HTML & CSS; jQuery; JavaScript; PHP; Python; Ruby; Web Projects; APIs

R

Learn R & Become a Data Analyst

D3.js

d3Vienno features a series of video tutorials, each about 10-12 minutes long, on using D3.js.

D3 Tools

A Chrome-specific bookmarklet that extracts SVG nodes and accompanying styles from an HTML document and downloads them as an SVG file, which you could then open and edit in Adobe Illustrator, for instance. Because SVGs are resolution independent, it’s great for when you want to use web technologies to create documents that are meant to be printed (like, maybe on newsprint). It was created with d3.js in mind, but it should work fine no matter how you choose to generate your SVG.

Data Analysis Tools

D3.js is a JavaScript library for manipulating documents based on data. D3 helps you bring data to life using HTML, SVG and CSS. D3’s emphasis on web standards gives you the full capabilities of modern browsers without tying yourself to a proprietary framework, combining powerful visualization components and a data-driven approach to DOM manipulation.
Take control of your R code. RStudio is the premier integrated development environment for R. It is available in open source and commercial editions and runs on the desktop (Windows, Mac, and Linux) or over the web with RStudio Server. Download RStudio (for Windows, Mac, or Linux).
OpenRefine is a standalone open source desktop application for data cleanup and transformation to other formats, an activity known as data wrangling. It is similar to spreadsheet applications (and can work with spreadsheet file formats); however, it behaves more like a database.
Wrangler allows interactive transformation of messy, real-world data into the data tables analysis tools expect. Export data for use in Excel, R, Tableau, Protovis, ...
HTSQL is designed for data analysts and other accidental programmers who have complex business inquiries to solve and need a productive tool to write and share database queries. HTSQL is free and open source software.
Jigsaw is a visual analytics system that helps analysts and researchers better explore, analyze, and make sense of document collections.

Maintenance

Boot Disks

Rufus is a utility that helps format and create bootable USB flash drives, such as USB keys/pendrives, memory sticks, etc.

Network Issues

The post reviews various "fixes" found all over the web and notes which one actually worked for its author.
Path MTU Discovery (PMTUD) in Windows doesn’t seem to work out the MTU for a given path, so Windows uses the default. For the most part this doesn’t affect anyone, but a PMTUD failure can result in some websites not loading correctly, trouble connecting to normally reliable online services, and general Internet weirdness.
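
When automatic discovery fails, the path MTU can be estimated by hand: try packets of different sizes with fragmentation disallowed and keep the largest size that gets through. The sketch below shows only that search, using a simulated probe. The names find_path_mtu and probe are hypothetical, the bounds 576 and 1500 are assumed defaults, and a real probe would have to send packets with the Don't Fragment flag set.

```python
def find_path_mtu(probe, low=576, high=1500):
    """Return the largest size in [low, high] for which probe(size) is True.

    probe(size) must report whether a packet of `size` bytes crosses the
    path without being fragmented. Assumes that if one size passes, every
    smaller size passes too. Returns None when even `low` fails.
    """
    if not probe(low):
        return None
    while low < high:
        mid = (low + high + 1) // 2  # round up so the loop always advances
        if probe(mid):
            low = mid       # mid fits; the answer is mid or larger
        else:
            high = mid - 1  # mid is too big; the answer is smaller
    return low


if __name__ == "__main__":
    # Simulated path whose true MTU is 1492 bytes (an assumed example value).
    print(find_path_mtu(lambda size: size <= 1492))
```

A binary search needs about ten probes to cover this range, compared with hundreds for a size-by-size scan.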

Google Drive

Select the text you want to strike through and press Alt+Shift+5 (Option+Shift+5 on Mac).
Press Ctrl+? to see other keyboard shortcuts.

See Also