GroupMe Bot

I wanted to interface with GroupMe and was hoping they had an API. Not only is there an API, there is an entire system for making bots.

In one group I’m a part of, the members kept changing their names, and it was hard to keep track of who was really who.

This bot has no memory, so whenever it sees someone change their name it downloads every message ever posted to the group and works out who’s who from the name-change history. The group I tested this on had 2,500+ messages, which takes about 5-10 seconds to process. For larger groups, and as smaller groups keep growing, it would be better to cache messages than to look up the entire history every time.
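
A rough sketch of that caching idea, separate from the bot itself, might look like the following. The cache file location, the helper names, and the sample messages are all made up for illustration, and the download step is left out. The GroupMe messages endpoint also appears to accept a since_id parameter, which could be used to request only messages newer than the last cached one, but check the API documentation before relying on that.

```php
<?php

// Hypothetical cache location; any writable path would do.
$cache_file = sys_get_temp_dir() . '/groupme_messages_cache.json';

// Load previously saved messages (message id => message), or an empty array.
function load_cache(string $file): array
{
    if (!is_file($file)) {
        return array();
    }
    $data = json_decode(file_get_contents($file), true);
    return is_array($data) ? $data : array();
}

// Merge freshly downloaded messages into the cache, keyed by message id
// so that a message fetched twice is stored only once.
function merge_messages(array $cached, array $fresh): array
{
    foreach ($fresh as $message) {
        $cached[$message['id']] = $message;
    }
    return $cached;
}

// Write the whole cache back to disk as JSON.
function save_cache(string $file, array $messages): void
{
    file_put_contents($file, json_encode($messages), LOCK_EX);
}

// Made-up messages standing in for one page of an API response.
$fresh = array(
    array('id' => '1001', 'system' => true, 'text' => 'Alice changed name to Wonderland'),
    array('id' => '1002', 'system' => false, 'text' => 'hello'),
);

$cached = merge_messages(load_cache($cache_file), $fresh);
save_cache($cache_file, $cached);

echo count(load_cache($cache_file)), " messages cached\n";
```

Because the cache is keyed by message id, running the merge again with the same page does not add duplicates, so the bot would only need to process whatever arrived since the last run.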

<?php

$group_id = 'YOUR_GROUP_ID_THE_BOT_IS_ASSIGNED_TO';
$token = 'YOUR_TOKEN';
$bot_id = 'YOUR_BOT_ID';

$data = json_decode(file_get_contents('php://input'));

if($data != null){
	// only react to system messages that announce a name change
	if(empty($data->system)){
		exit;
	}
	if(!preg_match('/(.+?) changed name to (.+)/', $data->text, $matches)){
		exit;
	}
	$new_name_change = $matches[2];
}


// download all the messages
$url_base = "https://api.groupme.com/v3/groups/".$group_id."/messages?token=".$token."&limit=100&";
$before_id = '';
$messages = array();

do {
	if($before_id != ''){
		$url = $url_base.'before_id='.$before_id;
	} else {
		$url = $url_base;
	}
	
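	// file_get_contents returns false on an HTTP error status, which json_decode turns into null
	$response = json_decode(file_get_contents($url));
	if(empty($response) || empty($response->response->messages)){
		// no response or an empty page: nothing older is left
		break;
	}
	
	// the response's count field looks like the group's total, not the page size,
	// so the loop stops on an empty page instead of testing count
	$messagesNew = $response->response->messages;
	$messages = array_merge($messages, $messagesNew);
	
	// next page: everything older than the last message received
	$before_id = end($messagesNew)->id;
} while(true);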

// keep only the system messages that announce a name change
$system = array();
foreach($messages as $message){
	if(!empty($message->system) && strpos($message->text, 'changed name') !== false){
		// append instead of keying by created_at, so two changes in the same second are both kept
		$system[] = $message->text;
	}
}

// messages were downloaded newest first; reverse them so the changes replay oldest first
$system = array_reverse($system);

// overrides
$overrides = array(
	// 'correct real name' => 'name when joined'
);

$map = $overrides;
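foreach($system as $change){
	if(!preg_match('/(.+?) changed name to (.+)/', $change, $matches)){
		continue;
	}
	$orig = $matches[1];
	$new = $matches[2];

	// $orig is somebody's current alias: update that person's entry
	$keys = array_keys($map, $orig);
	if(!empty($keys)){
		$key = $keys[0];
		$map[$key] = $new;
		
		// they changed back to their real name, so there is no alias left to track
		if($key == $new){
			unset($map[$key]);
		}
		continue;
	}

	// first change seen for this person: $orig is taken as their real name
	if(!isset($map[$orig])){
		$map[$orig] = $new;
	}
}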

if(!empty($new_name_change)){
	$keys = array_keys($map, $new_name_change);
	if(empty($keys)){
		// user changed their name back to their real name
		exit;	
	}
	$name = $keys[0];
	
	// build url (the token is left out; the bot_id identifies the bot)
	$message = urlencode($new_name_change."'s real name is ".$name);
	$url = 'https://api.groupme.com/v3/bots/post?bot_id='.$bot_id.'&text='.$message;
	
	// send message
	$ch = curl_init();
	curl_setopt($ch, CURLOPT_URL, $url);
	curl_setopt($ch, CURLOPT_POST, true);
	curl_setopt($ch, CURLOPT_POSTFIELDS, array());
	curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
	$result = curl_exec($ch);
}

?>
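
The map-building loop in the script above can be hard to follow, so here is a simplified sketch of the same idea run on hard-coded messages instead of live API data. The function name resolve_names and the sample names are invented for illustration; only the regular expression is taken from the bot. Unlike the script, it has no overrides table.

```php
<?php

// Hypothetical helper: takes name-change texts, oldest first, and returns
// a map of 'original name' => 'current name'.
function resolve_names(array $changes): array
{
    $map = array();
    foreach ($changes as $change) {
        if (!preg_match('/(.+?) changed name to (.+)/', $change, $m)) {
            continue; // not a name-change message
        }
        list(, $old, $new) = $m;

        // If $old is somebody's current alias, follow it back to the
        // name that person started with.
        $original = array_search($old, $map, true);
        if ($original === false) {
            $original = $old;
        }

        if ($original === $new) {
            unset($map[$original]); // changed back to the original name
        } else {
            $map[$original] = $new;
        }
    }
    return $map;
}

$map = resolve_names(array(
    'Alice changed name to Wonderland',
    'Bob changed name to Builder',
    'Wonderland changed name to Queen of Hearts',
    'Builder changed name to Bob',
));

print_r($map);
```

Alice ends up mapped to her latest alias, while Bob drops out of the map because he went back to his original name. Looking up an alias is then a reverse search, for example array_search('Queen of Hearts', $map).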

Asynchronous URL requests with PHP and cURL

I’m scraping a website for data. However, the content I need appears at random, so each page load shows a different sentence. I need all the unique sentences the website can display.

How to do this? The same way I’ve done it many times before: curl_multi_exec. Except I never remember how it works and always end up back at the manual.

Here’s my code:

<?php

$data = array();
$url = "http://www.example.com/";
$mh = curl_multi_init();

// I loop forever and watch the output
// put your condition to stop looping here
while(true){
	// create all cURL resources
	$chs = array();
	// number of simultaneous requests sent out per batch (10 here)
	// from experience I find 20-40 to be the fastest
	// but you should experiment and find out for yourself
	for($i = 0; $i < 10; $i++){
		$ch = curl_init();
		curl_setopt($ch, CURLOPT_URL, $url);
		curl_setopt($ch, CURLOPT_HEADER, 0);
		curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
		curl_multi_add_handle($mh, $ch);
		$chs[] = $ch;
	}

	$active = null;
	//execute the handles
	// this loop looks weird because php 5.3.18 broke curl_multi_select
	do {
		do {
		    $mrc = curl_multi_exec($mh, $active);
		} while ($mrc == CURLM_CALL_MULTI_PERFORM);
		// this fixes the multi select from returning -1 forever
		usleep(10000);
	} while(curl_multi_select($mh) === -1);

	while ($active && $mrc == CURLM_OK) {
	    if (curl_multi_select($mh) != -1) {
	        do {
	            $mrc = curl_multi_exec($mh, $active);
	        } while ($mrc == CURLM_CALL_MULTI_PERFORM);
	    }
	}

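	// collect the results and detach the handles
	foreach($chs as $ch){
		$html = curl_multi_getcontent($ch);
		curl_multi_remove_handle($mh, $ch);

		// parse here... the pattern needs a capture group, because $matches[1] is read below
		if(!preg_match_all('/(pattern)/', $html, $matches)){
			echo "Error!";
			continue;
		}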
		
		// store matches as keys so I can find the unique ones
		// bonus: increment the value to count how many times you find that match
		foreach($matches[1] as $match){
			if(!isset($data[$match])){
				$data[$match] = 1;
				echo "+: ".$match."\n";
			} else {
				$data[$match]++;
				echo " : ".$match."\n";
			}
		}
	}
}

// only reached once the loop above has a stop condition
curl_multi_close($mh);

?>
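
The parse-and-count step can be tried on its own without any network requests. The sketch below is only an illustration: tally_matches is a made-up helper name, and the markup and pattern are invented stand-ins for whatever the real page contains. The point is that the pattern has one capture group, which is what fills $matches[1].

```php
<?php

// Tally every first-capture-group match of $pattern found in $html.
// $data is updated in place; the return value lists matches not seen before.
function tally_matches(string $pattern, string $html, array &$data): array
{
    $new = array();
    if (!preg_match_all($pattern, $html, $matches)) {
        return $new; // no matches (or a regex error)
    }
    foreach ($matches[1] as $match) {
        if (!isset($data[$match])) {
            $data[$match] = 1;
            $new[] = $match;
        } else {
            $data[$match]++;
        }
    }
    return $new;
}

// Made-up markup and pattern, purely for illustration.
$pattern = '/<p class="quote">(.*?)<\/p>/s';
$data = array();

$first  = tally_matches($pattern, '<p class="quote">Hello</p>', $data);
$second = tally_matches($pattern, '<p class="quote">Hello</p><p class="quote">World</p>', $data);

print_r($data);
```

One possible stop condition for the endless loop would be to quit after several batches in a row return no new matches.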

I used the PHP manual as my starting point. Special thanks to ‘Alex Palmer’ for his comment on how to fix the curl_multi_select issue, and to ‘bfanger at gmail dot com’ for his quick solution.

Originally, curl_multi_select would block until something happened, such as a URL returning, but now it returns -1 no matter what. Alex posted a bug report, in which bfanger suggests adding a pause before checking curl_multi_select again.
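
The same workaround can also be folded into a single loop. The sketch below is not the code from this post: fetch_all is a made-up helper name, and it assumes libcurl 7.20 or newer, where curl_multi_exec no longer returns CURLM_CALL_MULTI_PERFORM, so the inner retry loop is dropped.

```php
<?php

// Fetch several URLs concurrently and return their bodies, keyed like $urls.
function fetch_all(array $urls): array
{
    $mh = curl_multi_init();
    $handles = array();
    foreach ($urls as $key => $url) {
        $ch = curl_init($url);
        curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
        curl_multi_add_handle($mh, $ch);
        $handles[$key] = $ch;
    }

    $active = null;
    do {
        $status = curl_multi_exec($mh, $active);
        if ($active && curl_multi_select($mh, 1.0) === -1) {
            // select had nothing it could wait on; pause briefly so the
            // loop does not spin at full speed.
            usleep(10000);
        }
    } while ($active && $status === CURLM_OK);

    // Collect the bodies and detach each handle from the multi handle.
    $results = array();
    foreach ($handles as $key => $ch) {
        $results[$key] = curl_multi_getcontent($ch);
        curl_multi_remove_handle($mh, $ch);
    }
    curl_multi_close($mh);

    return $results;
}
```

A call such as fetch_all(array_fill(0, 10, $url)) would then stand in for one batch of requests in the loop from the post.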