fork curlconverter for Better

2020/12/01 GitCode

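It is finally December. A new month is starting, so here is this month's goal: submit a PR, filed under GitCode.

curlconverter helped me a lot back when I was writing Python crawlers, and issue #22 (Add generator for Java) shows that the request for a Java version is still open. So the goal is to finish Java for curl.

curlconverter has a web version at https://curl.trillworks.com/, which is itself implemented in JavaScript. What it does is described in the README: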

README

Install

$ npm install --save curlconverter

Usage

var curlconverter = require('curlconverter');

curlconverter.toPython("curl 'http://en.wikipedia.org/' -H 'Accept-Encoding: gzip, deflate, sdch' -H 'Accept-Language: en-US,en;q=0.8' -H 'User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/39.0.2171.95 Safari/537.36' -H 'Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8' -H 'Referer: http://www.wikipedia.org/' -H 'Cookie: GeoIP=US:Albuquerque:35.1241:-106.7675:v4; uls-previous-languages=%5B%22en%22%5D; mediaWiki.user.sessionId=VaHaeVW3m0ymvx9kacwshZIDkv8zgF9y; centralnotice_buckets_by_campaign=%7B%22C14_enUS_dsk_lw_FR%22%3A%7B%22val%22%3A%220%22%2C%22start%22%3A1412172000%2C%22end%22%3A1422576000%7D%2C%22C14_en5C_dec_dsk_FR%22%3A%7B%22val%22%3A3%2C%22start%22%3A1417514400%2C%22end%22%3A1425290400%7D%2C%22C14_en5C_bkup_dsk_FR%22%3A%7B%22val%22%3A1%2C%22start%22%3A1417428000%2C%22end%22%3A1425290400%7D%7D; centralnotice_bannercount_fr12=22; centralnotice_bannercount_fr12-wait=14' -H 'Connection: keep-alive' --compressed");

Returns a string of Python code like:

import requests

cookies = {
    'GeoIP': 'US:Albuquerque:35.1241:-106.7675:v4',
    'uls-previous-languages': '%5B%22en%22%5D',
    'mediaWiki.user.sessionId': 'VaHaeVW3m0ymvx9kacwshZIDkv8zgF9y',
    'centralnotice_buckets_by_campaign': '%7B%22C14_enUS_dsk_lw_FR%22%3A%7B%22val%22%3A%220%22%2C%22start%22%3A1412172000%2C%22end%22%3A1422576000%7D%2C%22C14_en5C_dec_dsk_FR%22%3A%7B%22val%22%3A3%2C%22start%22%3A1417514400%2C%22end%22%3A1425290400%7D%2C%22C14_en5C_bkup_dsk_FR%22%3A%7B%22val%22%3A1%2C%22start%22%3A1417428000%2C%22end%22%3A1425290400%7D%7D',
    'centralnotice_bannercount_fr12': '22',
    'centralnotice_bannercount_fr12-wait': '14',
}

headers = {
    'Accept-Encoding': 'gzip, deflate, sdch',
    'Accept-Language': 'en-US,en;q=0.8',
    'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/39.0.2171.95 Safari/537.36',
    'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8',
    'Referer': 'http://www.wikipedia.org/',
    'Connection': 'keep-alive',
}

response = requests.get('http://en.wikipedia.org/', headers=headers, cookies=cookies)

To take part in development, the most important thing is to read the maintainer's contributing requirements carefully:

Contributing

I’d rather write programs to write programs than write programs.

Dick Sites, Digital Equipment Corporation, September 1985

Make sure you’re running node 12 or greater. The test suite will fail on older versions of node.

If you add a new generator, make sure to update the list of supported languages in cli.js or else it won’t be accessible from the command line. Further, you’ll want to update test.js and index.js for your new generator to make it part of the testing.

If you want to add new functionality, start with a test.

  • Create a file containing the curl command in fixtures/curl_commands with a descriptive filename like post_with_headers.txt
  • Create a file containing the output in fixtures/python_output/ with a matching filename (but different extension) like post_with_headers.py
  • Run tests with npm test.
  • If your filenames match correctly, you should see one failing test. Fix it by modifying the parser in util.js or the generators in generators/

The parser generates a generic data structure consumed by code generator functions.

You can run a specific test with this command:

node test.js --test=test_name

where “test_name” is a file (without extension) in fixtures/curl_commands

You can run only the tests for a specific language with this command:

node test.js --language=R

I recommend setting this up with a debugger so you can see exactly what the parser is passing to the generator. Here’s my Intellij run configuration for a single test:

Before submitting a PR, please check that your JS code conforms to the code style enforced by standardjs. Use the following to fix your code if it doesn’t:

$ standard --fix my_file.js

If you get stuck, please reach out via email. I am always willing to hop on a google hangout and pair program.

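Key points:

  • Use Node 12 or newer; the test suite fails on older versions.
  • A new generator has to be added to the list of supported languages in cli.js, otherwise it cannot be reached from the command line. It also has to be wired into test.js and index.js so that it is covered by the tests.
  • New functionality starts with a test: put the curl command in fixtures/curl_commands (for example post_with_headers.txt), put the expected output under a matching filename with a different extension (for example fixtures/python_output/post_with_headers.py), then run npm test. If the filenames match, you should see one failing test; fix it in the parser in util.js or in the generators in generators/.
  • The parser produces a generic data structure, and the code generator functions consume it.
  • node test.js --test=test_name runs a single test, where test_name is a file in fixtures/curl_commands without its extension. node test.js --language=R runs the tests for a single language.
  • Before submitting a PR, check that the JS code follows the style enforced by standardjs; standard --fix my_file.js repairs violations.
  • If you get stuck, email the maintainer, who is willing to pair program over a Google Hangout.
  • The maintainer recommends running a single test under a debugger, so you can see exactly what the parser passes to the generator, and shares an IntelliJ run configuration for it:

Screenshot of intellij debug configuration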
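The contributing notes say that the parser produces a generic data structure which the generator functions turn into code. The sketch below is only meant to make that idea concrete for the Java case. It is written in Java rather than in curlconverter's JavaScript, and ParsedRequest, quote, and toJava are hypothetical names invented for this post, not curlconverter's real structure or API. It assumes a request is just a method, a URL, and a set of headers, and it emits HttpURLConnection calls as plain text.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class GeneratorSketch {
    /** Hypothetical stand-in for the parser's output; NOT curlconverter's real structure. */
    static final class ParsedRequest {
        final String method;
        final String url;
        final Map<String, String> headers;

        ParsedRequest(String method, String url, Map<String, String> headers) {
            this.method = method;
            this.url = url;
            this.headers = headers;
        }
    }

    /** Escapes backslashes and double quotes so the value fits inside a Java string literal. */
    static String quote(String value) {
        return "\"" + value.replace("\\", "\\\\").replace("\"", "\\\"") + "\"";
    }

    /** Emits Java source text that prepares the request with java.net.HttpURLConnection. */
    static String toJava(ParsedRequest request) {
        StringBuilder out = new StringBuilder();
        out.append("URL url = new URL(").append(quote(request.url)).append(");\n");
        out.append("HttpURLConnection conn = (HttpURLConnection) url.openConnection();\n");
        out.append("conn.setRequestMethod(").append(quote(request.method)).append(");\n");
        for (Map.Entry<String, String> header : request.headers.entrySet()) {
            out.append("conn.setRequestProperty(")
               .append(quote(header.getKey())).append(", ")
               .append(quote(header.getValue())).append(");\n");
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // LinkedHashMap keeps the headers in the order they appeared in the curl command.
        Map<String, String> headers = new LinkedHashMap<>();
        headers.put("Accept-Language", "en-US,en;q=0.8");
        headers.put("Referer", "http://www.wikipedia.org/");
        System.out.print(toJava(new ParsedRequest("GET", "http://en.wikipedia.org/", headers)));
    }
}
```

A real generator would also have to cover cookies, request bodies, and the imports of the emitted code. The point here is only the split of work: the parser fills the structure once, and each language gets its own function that walks it.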


Analyzing another contributor's PR

DainisGorbunovs


JAVA for Curl

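Choosing an HTTP library

Back in 2016, NickCarneiro opened issue #22 (Add generator for Java) asking for a Java version. Java is fairly verbose, though, and has no library as pleasant to use as Python's requests, so there is still no Java version today.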

Java is super popular and super verbose, making it a good candidate for curlconverter.
We need to find out if there is some modern library for sending http requests. Please advise.

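1. HttpClient

HttpClient: the code is complex, and you have to take care of releasing resources yourself. It needs a lot of boilerplate, so using it directly is not recommended.

Using HttpClient

Sending a request with HttpClient takes the following steps:

  • Create a CloseableHttpClient object or a CloseableHttpAsyncClient object; the former is synchronous, the latter asynchronous
  • Create the HTTP request object
  • Call the execute method to run the request; for an asynchronous request, call start before executing

Three ways to send HTTP requests from Java:

  1. The JDK's built-in HttpURLConnection
  2. Apache HttpClient 3.1
  3. Apache HttpClient 4.5

For all three, see the article "Three ways to implement HTTP requests in Java", which includes code demos.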
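As a rough illustration of the first option in that list, the sketch below prepares a GET request with the JDK's HttpURLConnection, using headers from the README example. The names UrlConnectionSketch, prepareGet, and fetch are made up for this post; the 5-second timeouts and the UTF-8 decoding are assumptions, not requirements. Closing the stream and disconnecting by hand is the same kind of resource-handling chore that makes these APIs feel heavy.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.HttpURLConnection;
import java.net.URI;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

class UrlConnectionSketch {
    /** Builds a GET request with the given headers. Nothing is sent over the network yet. */
    static HttpURLConnection prepareGet(String url, Map<String, String> headers) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) URI.create(url).toURL().openConnection();
        conn.setRequestMethod("GET");
        conn.setConnectTimeout(5_000); // assumed timeout, in milliseconds
        conn.setReadTimeout(5_000);    // assumed timeout, in milliseconds
        for (Map.Entry<String, String> header : headers.entrySet()) {
            conn.setRequestProperty(header.getKey(), header.getValue());
        }
        return conn;
    }

    /** Sends the request and returns the body, decoded as UTF-8 (an assumption). */
    static String fetch(HttpURLConnection conn) throws IOException {
        // getInputStream() performs the request and throws IOException on 4xx/5xx responses.
        try (InputStream in = conn.getInputStream()) {
            return new String(in.readAllBytes(), StandardCharsets.UTF_8);
        } finally {
            conn.disconnect();
        }
    }

    public static void main(String[] args) throws IOException {
        Map<String, String> headers = new LinkedHashMap<>();
        headers.put("Accept-Language", "en-US,en;q=0.8");
        headers.put("Referer", "http://www.wikipedia.org/");
        HttpURLConnection conn = prepareGet("http://en.wikipedia.org/", headers);
        // Only inspects the prepared request; calling fetch(conn) would actually send it.
        System.out.println(conn.getRequestMethod() + " " + conn.getURL());
    }
}
```

Calling fetch(prepareGet(url, headers)) performs the request. Splitting preparation from sending keeps the part a generator would have to emit (method, URL, headers) separate from the I/O.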

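2. OkHttp

OkHttp is an efficient HTTP client: with HTTP/2, all requests to the same host can share one socket connection; connection pooling reduces request latency; transparent GZIP compression shrinks response bodies; and response caching avoids repeating identical requests.

Using OkHttp

Sending a request with OkHttp takes the following steps:

  • Create an OkHttpClient object
  • Create a Request object
  • Wrap the Request in a Call
  • Execute the Call: the execute method runs it synchronously, the enqueue method runs it asynchronously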

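3. RestTemplate

RestTemplate is the client that Spring provides for accessing REST services. It offers a variety of convenient methods for calling remote HTTP services, which can greatly speed up writing client code.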

https://www.cnblogs.com/zk-blog/p/12465951.html

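4. http-request

In day-to-day work we often have to talk to third-party APIs, and most of the time that happens over HTTP. Java ships with built-in HTTP support (java.net.*), but it is not convenient to use; apart from that, the most widely used option is the Apache HttpClient toolkit. After using both for a long time, I find that neither the built-in API nor HttpClient feels comfortable, and both are more complicated than they need to be.

See the article "http-request, a Java HTTP request utility class".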

Author: Mrli

Link: https://nymrli.top/2020/12/01/fork-curlconverter-for-Better/

Copyright: All articles in this blog are licensed under CC BY-NC-SA 3.0 unless stating additionally.
